Decentralized AI for the Rest of Us

Jesus Rodriguez
Chief Scientist
Invector Labs

About Me
• Founder and Chief Scientist at Invector Labs (http://invectorlabs.com)
• Next-generation, on-demand software development agency
• Focused on deep technologies (AI, blockchain…)
• Speaker, author (https://medium.com/@jrodthoughts)
• Investor, board member on over a dozen deep-tech companies

Agenda
• The challenges of centralized AI
• Decentralizing AI: risks and promises
• Foundational blocks
• Homomorphic Encryption
• GAN cryptography
• Secure multi-party computations
• Federated Learning
• Blockchains/Tokens
• Existing decentralized AI platforms

Some References
• Why Decentralized AI Matters Part I: Economics and Enablers:
https://medium.com/datadriveninvestor/why-decentralized-ai-matters-part-i-economics-and-enablers-5576aeeb43d1
• Why Decentralized AI Matters Part II: Technological Enablers:
part-ii-technological-enablers-a67e3115312e
• Why Decentralized AI Matters Part III: Platforms:
part-iii-technologies-930c3c9d10d
• AI Has Not One, Not Two, but Many Centralization Problems:
https://hackernoon.com/ai-has-not-one-not-two-but-many-centralization-problems-a5f0664361ed

Decentralized AI: Hype vs. Reality

We Have Hundreds of Millions of Devices
Running AI Models

Decentralized Architectures Are Here to Stay

The Evolution of Economic Movements
Centralized
Decentralized
Recentralized

The Centralized Nature of AI…

Four Centralization Vectors of AI
• Data
• Models
• Training
• Regularization-Optimization

The Data Centralization Problem

The Model Centralization Problem

The Regularization-Optimization Centralization Problem

Let’s Get Philosophical:
Risks of Centralized AI…

The Risks of Centralized AI
• The Decentralized Knowledge vs. Centralized AI Friction
• The Rich Get Richer Problem
• The Transparency/Influence Ratio

The Rich Get Richer Problem
• Big companies have a data advantage over startups when it comes to AI
• A big company decides to create different AI models against its proprietary datasets
• Those AI models produce even more proprietary data and intelligence

Decentralized Knowledge vs. Centralized Models
• How can we know if the knowledge of an AI model is correct?
• Knowledge is an intrinsically decentralized activity
• But companies insist on constraining AI models to proprietary datasets

The Transparency/Influence Ratio
• AI algorithms are increasingly important in our lives
• But we know very little about them
• Who trains them?
• How were they trained?
• How do they build knowledge?

Makes sense…
Let’s Decentralize the Whole Thing…

Challenges to Achieve AI Decentralization
• The Economic Problem: Can third parties be correctly incentivized to contribute to the knowledge and quality of an AI model?
• The Transparency Problem: Can the activity and behavior of an AI model be transparently available to all parties without the need to trust a centralized authority?
• The Autonomy Problem: Can models be distributed and executed autonomously across hundreds of thousands of nodes?
• The Privacy Problem: Can entities train a model without having to disclose their data?

Trends Influencing Decentralized AI…

Data Privacy in Decentralized AI Architectures…
You know… security and that stuff

The Challenges
• How to share data with
different parties while
maintaining certain levels
of privacy?
• How to perform
computations over
encrypted data?
• How to become resilient
to privacy attacks in a
decentralized network?

Homomorphic Encryption
• Alice wants workers to assemble raw materials into jewelry.
• But Alice is worried about theft:
• She wants workers to process raw materials without having access to them.
• Alice puts the raw materials in a locked glovebox.
• Workers assemble the jewelry inside the glovebox, using the gloves.
• Alice unlocks the box to get the “results.”

A More Practical Example
• Alice (input: data x, secret key sk): “I want 1) the cloud to process my data 2) even though it is encrypted.”
• Alice sends Enc_pk(x) to the server (cloud), together with the function f (this could be encrypted too).
• The server runs Eval[f, Enc_pk(x)] = Enc_pk[f(x)] and returns Enc_pk[f(x)]; Alice decrypts it to obtain f(x).
• The special sauce! For security parameter λ, Eval’s running time should be Time(f)∙poly(λ).
• Delegation: It should cost less for Alice to encrypt x and decrypt f(x) than to compute f(x) herself.

Full vs. Partial Homomorphic Encryption
• Given the encryption of two data primitives a and b: E(a) and E(b)
• Partially homomorphic encryption can compute either E(a+b) or E(ab), but not both, without knowing a, b, or the private key
• Unpadded RSA (multiplicative) https://en.wikipedia.org/wiki/RSA_cryptosystem
• ElGamal (multiplicative) https://en.wikipedia.org/wiki/ElGamal_encryption
• Paillier (additive) https://en.wikipedia.org/wiki/Paillier_cryptosystem
• Fully homomorphic encryption can compute both E(a+b) and E(ab)
• Gentry
• van Dijk-Gentry-Halevi-Vaikuntanathan
Homomorphic Encryption and Decentralized AI
[Diagram: HEnc(Dataset) and HEnc(Results) exchanged with Data Scientists]

Homomorphic Encryption Today
• Remains mostly a theoretical exercise
• Companies such as IBM and Microsoft have released research frameworks for homomorphic encryption
• HElib https://github.com/shaih/HElib

Adversarial Neural Cryptography…

GAN Cryptography
• Simpler alternative to homomorphic encryption
• Uses generative adversarial networks (GANs) to protect communications between two parties
• The encryption algorithms evolve dynamically with the performance
of the networks

GAN Cryptography
• Alice and Bob are neural networks trying to communicate securely while Eve tries to break the communication
• The outputs from Eve (P_Eve) are factored into the loss function for both Alice and Bob
• Alice and Bob both discover new encryption algorithms that can defeat the best version of Eve
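
One way to picture how Eve's output enters the loss for Alice and Bob is sketched below. The slides do not give a formula, so the specific form used here is an assumption borrowed from a commonly cited formulation in the adversarial neural cryptography literature: bits are encoded as -1/+1, Eve minimizes her own reconstruction error, and Alice and Bob are penalized whenever Eve does better or worse than random guessing. Only the loss arithmetic is shown, not the network training, and all function names are illustrative.

```python
def bit_errors(plaintext, guess):
    """Half the L1 distance: the number of wrong bits when bits are -1 or +1."""
    return sum(abs(p - g) for p, g in zip(plaintext, guess)) / 2

def eve_loss(plaintext, eve_guess):
    """Eve simply tries to reconstruct the plaintext."""
    return bit_errors(plaintext, eve_guess)

def alice_bob_loss(plaintext, bob_guess, eve_guess):
    """Bob's reconstruction error plus a term driven by Eve's output."""
    n = len(plaintext)
    bob_error = bit_errors(plaintext, bob_guess)
    eve_error = bit_errors(plaintext, eve_guess)
    # Zero when Eve gets exactly half the bits wrong (random guessing),
    # growing as Eve moves away from chance in either direction.
    eve_term = (n / 2 - eve_error) ** 2 / (n / 2) ** 2
    return bob_error + eve_term

plaintext = [1, -1, 1, 1, -1, -1, 1, -1]
bob_guess = list(plaintext)                      # Bob decrypts perfectly
eve_random = [1, -1, 1, 1, 1, 1, -1, 1]          # Eve gets 4 of 8 bits wrong
eve_perfect = list(plaintext)                    # Eve has broken the scheme

loss_when_eve_guesses = alice_bob_loss(plaintext, bob_guess, eve_random)
loss_when_eve_wins = alice_bob_loss(plaintext, bob_guess, eve_perfect)
print(loss_when_eve_guesses, loss_when_eve_wins)
```

The second value is larger than the first, so gradient steps on this loss push Alice and Bob toward encodings that Eve cannot read.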

GAN Cryptography and Decentralized AI
[Diagram: a GAN Data Provider, two GAN Data Scientists, and a GAN Listener, with Enc(Dataset) passing between them]

GAN Cryptography Today
• Remains mostly an active research area
• Some lightweight implementations like Numerai are available

Secure Multi-Party Computations…

sMPC
• Solves the classic millionaires’ problem for n parties
• A group of millionaires wants to know which of them is richest without revealing their actual wealth. This is an instance of a more general problem: given two numbers a and b, decide whether a ≥ b without revealing the actual values of a and b.
• Enables secure computations in a network without a trusted party

sMPC
• A set of parties with private inputs
• Parties wish to jointly compute a function
of their inputs so that certain security
properties (like privacy and correctness)
are preserved
• Properties must be ensured even if some
of the parties maliciously attack the
protocol
• Examples
• Secure elections
• Auctions
• Privacy-preserving data mining
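
A minimal sketch of parties jointly computing a function of private inputs, using additive secret sharing to compute a sum. This is a deliberately simpler function than the comparison in the millionaires' problem, and it assumes honest-but-curious parties, so it does not provide the protection against malicious participants mentioned above. The modulus, the function names, and the sample figures are illustrative assumptions; a real deployment would draw randomness from a cryptographic source rather than the random module.

```python
import random

MODULUS = 2**61 - 1   # assumption: any modulus larger than the true total works

def share(secret, n_parties, rng):
    """Split a secret into n random-looking shares that sum to it mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_inputs, rng):
    n = len(private_inputs)
    # Row i holds the shares party i hands out, one share per party.
    distributed = [share(value, n, rng) for value in private_inputs]
    # Party j adds up the shares it received (column j) and publishes the result.
    partial_sums = [
        sum(row[j] for row in distributed) % MODULUS for j in range(n)
    ]
    # The published partial sums reveal the total but no individual input.
    return sum(partial_sums) % MODULUS

rng = random.Random(42)   # seeded only to make the demo repeatable
wealth = [3_000_000, 12_500_000, 7_250_000]
total = secure_sum(wealth, rng)
print(total)
```

Each party only ever sees uniformly random shares from the others, yet the published partial sums add up to the exact total.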

sMPC and Decentralized AI
[Diagram: HEnc(Dataset) and HEnc(Results) exchanged with Data Scientists]

sMPC Cryptography Today
• Several sMPC frameworks and libraries available
• The Enigma blockchain (https://enigma.co/) is one of the most complete and scalable sMPC implementations on the market

Solving Decentralized Learning…
How to learn without the man?

The Challenges
• How to execute machine
learning models on mobile
or IoT devices?
• How to customize a
machine learning model
based on personal data?
• How to improve a
machine learning model
based on executions
across a large number of
devices?

Federated Learning and Decentralized AI
[Diagram, two panels. Federated: four AI Execution Nodes and one Federated Training Node. Decentralized: five AI Training-Execution Nodes.]
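
A minimal sketch of the federated panel, assuming a federated-averaging style of training: each execution node fits a tiny model on its own data, and the training node only ever sees the resulting weights, which it averages in proportion to each node's sample count. The one-parameter linear model, the synthetic data, the learning rate, and all function names are assumptions made for illustration and are not taken from any framework.

```python
import random

def local_update(weight, data, lr=0.1, epochs=5):
    """A few gradient-descent steps on one device's private (x, y) pairs."""
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight

def federated_round(global_weight, devices):
    """Devices train locally; only weights (never raw data) reach the server."""
    updates = [(local_update(global_weight, data), len(data)) for data in devices]
    total = sum(count for _, count in updates)
    # Weighted average: devices with more samples count for more.
    return sum(w * count for w, count in updates) / total

def make_device(size, rng, true_slope=3.0):
    """Synthetic private dataset following y = true_slope * x plus small noise."""
    points = []
    for _ in range(size):
        x = rng.uniform(-1, 1)
        points.append((x, true_slope * x + rng.gauss(0, 0.05)))
    return points

rng = random.Random(0)   # seeded only to make the demo repeatable
devices = [make_device(size, rng) for size in (20, 50, 30)]

weight = 0.0
for _ in range(50):
    weight = federated_round(weight, devices)

print(round(weight, 2))
```

The learned weight should land close to the slope used to generate the data, even though no device ever shares a single data point.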

Federated Learning Today
• There are several implementations available on TensorFlow Lite
• Google has been testing federated learning on Gboard
https://www.blog.google/products/search/gboard-now-on-android/

A Runtime for Decentralized AI…
How do we run these things?

Challenges
• How to train models without a
centralized authority?
• How to use trained models without
trusting a central party?
• How to validate and enforce trust among the different parties in the lifecycle of a deep learning model?

Blockchains and Smart Contracts…

Blockchains and Smart Contracts

Smart Contracts and Decentralized AI
[Diagram elements: Company; Data and Success Criteria; Data Scientist (builds and trains model); Data Scientist (tests and optimizes model); three Blockchain Smart Contracts]

Solving the Incentive Problem…
How to pay the data geeks?

Challenges
• How to build incentives for data
scientists training, testing and building
models?
• How to build incentives for parties
contributing datasets?
• How to discourage bad behaviors in
the network?

Tokenized Decentralized AI
[Diagram elements: Company; Data and Success Criteria; Data Scientist (builds and trains model); Data Scientist (tests and optimizes model); three Blockchain Smart Contracts; Tokens flowing between the parties]
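
The tokenized flow in this diagram can be sketched as a contract that holds a company's data, success criteria, and a token bounty in escrow, then pays the first data scientist whose model meets the criteria. This is a plain in-memory Python stand-in, not code for any real blockchain; the BountyContract class, its fields, and the accuracy-based criterion are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class BountyContract:
    """Hypothetical in-memory stand-in for a blockchain smart contract."""
    test_set: list        # (input, label) pairs published by the company
    min_accuracy: float   # the company's success criterion
    bounty: int           # tokens held in escrow
    balances: dict = field(default_factory=dict)
    paid: bool = False

    def evaluate(self, model) -> float:
        """Fraction of the test set the submitted model labels correctly."""
        correct = sum(1 for x, y in self.test_set if model(x) == y)
        return correct / len(self.test_set)

    def submit(self, scientist: str, model) -> bool:
        """Release the escrowed tokens if the model meets the criterion."""
        if self.paid:
            return False   # the bounty can only be claimed once
        if self.evaluate(model) < self.min_accuracy:
            return False   # criterion not met, tokens stay in escrow
        self.balances[scientist] = self.balances.get(scientist, 0) + self.bounty
        self.paid = True
        return True

# The company publishes data, a 90% accuracy target, and 100 tokens.
test_set = [(x, x >= 5) for x in range(10)]
contract = BountyContract(test_set=test_set, min_accuracy=0.9, bounty=100)

weak_accepted = contract.submit("alice", lambda x: True)      # 50% accurate
good_accepted = contract.submit("bob", lambda x: x >= 5)      # 100% accurate
late_accepted = contract.submit("carol", lambda x: x >= 5)    # bounty already paid

print(weak_accepted, good_accepted, late_accepted, contract.balances)
```

Because payment depends only on a check that anyone can rerun, no party has to trust the company or the data scientist to settle honestly.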

Decentralized AI Platforms…
Awesomeness

SingularityNET
• Network powering Sophia
• DApp Marketplace: The SingularityNET
DApp is the entry-point to discovering
and using AI services on the
SingularityNET Network
• SNET Registry: The SingularityNET
Registry is an open and uncensorable
registry of Organizations, AI Services, and
Type Repositories that are accessible
from within the SingularityNET Network.
• Service Daemon: The Service Daemon
exposes an AI developer's application as
an API that is accessible through the
SingularityNET Network.

OpenMined
• Open source community focused on creating
decentralized AI protocols
• Sonar: A federated learning server running on the blockchain that handles all campaign requests, holding the bounty in trust.
• Capsule: A third-party PGP server that generates public and private keys to ensure that the Sonar neural network stays properly encrypted.
• Mine: The individual data repositories of a user. These constantly check Sonar for new neural nets to contribute to.
• Syft: The library containing neural networks that can be trained in an encrypted state (so that miners can’t steal the neural networks they download to train).

Algorithmia DanKu
• Created by Algorithmia
• Decentralized protocol for
running machine learning
contests
• Users can publish datasets using
smart contracts
• Models are trained on the
datasets and submitted to the
blockchain
• Other models evaluate the results
and compensate the users

Ocean
• Ocean Protocol is an ecosystem for sharing data
and associated services
• Providers: These actors have AI data or services
that they make available in a cryptographically
provable fashion.
• Marketplaces: Data/service marketplaces are typically how providers and consumers interact with the Ocean network, for convenience.
• Data commons interfaces: Alongside the data marketplaces that serve priced data, these interfaces serve free or commons data.
• Keeper: Keepers are responsible for collectively
maintaining the network. Anyone can run an
Ocean keeper node; it’s permissionless.
Participation is open and anonymous.

Summary
• The future of AI is likely to be more decentralized
• Achieving decentralized AI requires three key building blocks
• A decentralized runtime to execute computations
• Decentralized and scalable learning methods
• Strong privacy protocols for the exchange of data and models
• Blockchains have become catalysts for the implementation of decentralized AI architectures
• It’s happening already

Thanks
jr@invectoriq.com
https://medium.com/@jrodthoughts
https://twitter.com/jrdothoughts
