Apache Knox Gateway is a proxy for interacting with Apache Hadoop clusters securely. It provides authentication, service-level authorization, and many other extensions that secure HTTP interactions in your cluster. One main feature of Apache Knox Gateway is the ability to extend the reach of your REST APIs to the internet while still securing your cluster and working with Kerberos. Recent contributions to the Apache Knox community have added support for Single Sign On (SSO) based on Pac4j 1.8.9, a powerful security engine that provides SSO support through SAML2, OAuth, OpenID, and CAS. In addition, through recent community contributions, Apache Ambari and Apache Ranger can now also provide SSO authentication through Knox. This paper discusses the architecture of Knox SSO, explains how enterprise users can benefit from this feature, and presents enterprise use cases for Knox SSO, including integration with open source Shibboleth, ADFS on Windows Server as an IdP, and the Okta cloud IdP.
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
1. Apache Knox Gateway “Single
Sign On” expands the reach of
the Enterprise users
Jeffrey E Rodriguez Viaña
Tanping Wang
June 2017
2. Who Am I?
• Jeffrey E Rodriguez
• Senior BigData Engineer/Tech
Security Leader
• Work @ IBM.
• Apache Hadoop/Knox contributor.
• Apache Xerces committer.
• https://www.linkedin.com/in/jeffrey
rodriguezinnovationperu/
3. Apache Knox Gateway is a proxy for interacting with Apache Hadoop
clusters securely. It provides authentication, service-level authorization,
and many other extensions that secure HTTP interactions in your cluster.
One feature of Apache Knox Gateway is the ability to extend the reach of
your REST APIs to the internet while still securing your cluster and working
with Kerberos. Recent contributions to the Apache Knox community have
added support for Single Sign On (SSO) based on Pac4j 1.8.9, a
powerful security engine that provides SSO support through SAML2,
OAuth, OpenID, and CAS. In addition, through recent community
contributions, Apache Ambari, Apache Atlas, and Apache Ranger can now
also provide SSO authentication through Knox. This presentation
discusses the architecture of Knox SSO, explains how enterprise users can
benefit from this feature, and presents enterprise use cases for Knox SSO,
including integration with open source Shibboleth, ADFS on Windows Server
as an IdP, and the Okta cloud IdP.
5. Single Sign On/Federation
• Knox “SSO” is not a Kerberos or LDAP replacement but an effective
way to distribute enterprise authentication resources.
• You no longer need to proliferate authentication resources (LDAP,
KDCs, etc.); instead you can put these resources behind identity providers
such as Shibboleth, ADFS, WSO2, or Okta.
• You can also do identity management through IdP services. This
means the user identity lifecycle, credentials, and authorization can be
managed in one single place.
6. • The Apache Knox Gateway is a system that provides a single point of
authentication and access for Apache™ Hadoop® services. It provides
the following features: Single REST API Access Point. Centralized
authentication, authorization and auditing for Hadoop REST/HTTP
services.
7. • An identity provider is defined as "a kind of provider that creates,
maintains, and manages identity information for principals and
provides principal authentication to other service providers within a
federation, such as with web browser profiles."
8. Knox Idps
1. Form-based identity Provider – Knox has a customizable form
application which leverages JWT. – AKA local SSO
• JWT, JSON Web Token – RFC 7519.
• “JSON Web Token (JWT) is a compact, URL-safe means of representing claims
to be transferred between two parties. The claims in a JWT are encoded as a
JSON object that is used as the payload of a JSON Web Signature (JWS)
structure or as the plaintext of a JSON Web Encryption (JWE) structure,
enabling the claims to be digitally signed or integrity protected with a
Message Authentication Code (MAC) and/or encrypted.”
• SAML – based identity Provider
• This is set through the knoxsso.xml topology.
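The JWT structure quoted above can be sketched in a few lines. This is a minimal illustration of the JWS compact form (header.payload.signature), not KnoxSSO's own code. It assumes an HMAC-SHA256 (HS256) MAC so that it runs on the Python standard library alone; a real SSO deployment would more likely sign with an asymmetric key pair. The function names, claim values, and secret are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWS compact form requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that the compact form strips."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature, protected with HMAC-SHA256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    mac = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(mac)


def verify_jwt(token: str, secret: bytes) -> dict:
    """Return the claims if the MAC checks out; raise ValueError otherwise."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("not a three-part compact token") from None
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))
```

For example, `verify_jwt(sign_jwt({"sub": "guest"}, b"demo-secret"), b"demo-secret")` returns the original claims, while verifying with a different secret raises `ValueError`.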
9. Knox Idps infrastructure
2. SAML-based Identity Provider (IdP)
• This leverages the pac4j library to add support for SAML, CAS, OAuth, and OpenID.
• It also requires changes to the knoxsso.xml and default.xml topologies.
• At the time of this presentation there is no support for establishing groups
from the SAML assertion and the participating applications must use a group
lookup to establish group membership based on username.
10. Single Sign On Providers
• This use case lets a web application builder, like our demo KnoxExplorer (on a different
domain, www.local.com), reach Hadoop HDFS cluster data in a secure way and
process/transform/analyze that data.
• Many commercial identity providers are available as a service, and
enterprises have several choices:
• Host their own SAML, OAuth, etc. identity provider using IBM TFIM (IBM Tivoli
Federated Identity Manager) or Microsoft Active Directory Federation
Services (ADFS).
• Use a commercial web service such as the Okta cloud IdP.
• Use IBM Bluemix cloud SSO as a Service APIs.
• There are Ambari Single Sign On services such as
https://www.onelogin.com/connector/ambari-single-sign-on
11. Starting in version 0.8.0, Knox has SSO support
(CAS/OAuth/OpenID/SAML) using pac4j.
pac4j is a Java security engine to authenticate users, get their
profiles and manage their authorizations in order to secure Java
web applications.
It supports many authentication mechanisms for UI and web
services and is implemented by many frameworks and tools.
For Knox, it is used as a federation provider to support the OAuth,
CAS, SAML and OpenID Connect protocols. It must be used for
SSO, in association with the KnoxSSO service and optionally with
the SSOCookieProvider for access to REST APIs.
12. Knox SSO Providers/Services
• KnoxSSO Default Form-based IDP - The default configuration of KnoxSSO
provides a form-based authentication mechanism that leverages the Shiro
authentication provider
to authenticate against LDAP/AD with credentials collected from a
form-based challenge.
• Pac4J - The pac4j provider adds numerous authentication and federation
capabilities including: SAML, CAS, OpenID Connect, Google, Twitter, etc.
• HeaderPreAuth - A simple mechanism for propagating the identity through
HTTP headers that specify the username and group for the
authenticated user. This has been built with vendor use cases in mind, such as
SiteMinder and IBM Tivoli Access Manager.
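The HeaderPreAuth idea can be sketched as a small function that reads the identity a trusted upstream proxy has already established. This is only an illustration of the pattern, not Knox's implementation. The header names `SM_USER` and `SM_GROUPS`, the comma-separated group format, and the `Identity` type are assumptions made for the example.

```python
from typing import Mapping, NamedTuple, Optional, Tuple


class Identity(NamedTuple):
    """Hypothetical container for the propagated user and groups."""
    username: str
    groups: Tuple[str, ...]


def identity_from_headers(
    headers: Mapping[str, str],
    user_header: str = "SM_USER",     # assumed header name
    group_header: str = "SM_GROUPS",  # assumed header name
) -> Optional[Identity]:
    """Return the pre-authenticated identity, or None if no user header is set."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    username = lowered.get(user_header.lower(), "").strip()
    if not username:
        return None
    raw_groups = lowered.get(group_header.lower(), "")
    # Assumed format: a comma-separated list of group names.
    groups = tuple(g.strip() for g in raw_groups.split(",") if g.strip())
    return Identity(username, groups)
```

Because the headers are trusted as-is, this pattern is only safe when requests can reach the gateway exclusively through the authenticating proxy.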
13. Knox SSO Providers/Services
• KnoxSSO - The KnoxSSO service is an integration service that provides
a normalized SSO token for representing the authenticated user.
This token is generally used for WebSSO capabilities for participating
UIs and their consumption of the Apache Hadoop REST APIs.
KnoxSSO abstracts the actual identity provider integration away from
participating applications so that they only need to
be aware of the KnoxSSO cookie. The token is presented by the
browser as a cookie and applications that are participating in
the KnoxSSO integration are able to cryptographically validate the
presented token and remain agnostic to the underlying
SSO integration.
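From a participating application's side, the first step of this flow can be sketched as: look for the SSO cookie and, if it is absent, send the browser to KnoxSSO with the URL to return to. The sketch covers only that presence check and redirect; cryptographic validation of the token is a separate step and is not shown. The cookie name `hadoop-jwt`, the `websso` endpoint path, the `originalUrl` parameter, and the host name are assumptions not stated in the slides and should be checked against the actual topology.

```python
from http.cookies import SimpleCookie
from typing import Optional
from urllib.parse import urlencode

SSO_COOKIE = "hadoop-jwt"  # assumed KnoxSSO cookie name
# Hypothetical gateway host; the path is an assumed KnoxSSO endpoint.
WEBSSO_URL = "https://knox.example.com:8443/gateway/knoxsso/api/v1/websso"


def sso_token(cookie_header: str) -> Optional[str]:
    """Extract the SSO token from a raw Cookie header, if present."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(SSO_COOKIE)
    return morsel.value if morsel is not None and morsel.value else None


def login_redirect(cookie_header: str, original_url: str) -> Optional[str]:
    """Return the URL to redirect to, or None when a token is already present."""
    if sso_token(cookie_header) is not None:
        return None
    # The original URL is passed along so the user lands back where they started.
    return WEBSSO_URL + "?" + urlencode({"originalUrl": original_url})
```

The application never talks to the identity provider directly here, which is the point of the abstraction: swapping the form-based IdP for SAML changes nothing in this code.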
14. SAML (almost every other known SSO solution
follows a similar pattern)
• Security Assertion Markup Language (XML based).
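To make the "XML based" point concrete, here is a rough sketch of how a service provider might package a SAML 2.0 authentication request for the HTTP-Redirect binding: the XML is raw-DEFLATE compressed, base64-encoded, and URL-encoded into a query parameter. In Knox this work is handled by pac4j; the code below is an independent illustration. The request is unsigned and stripped to a few attributes, and all URLs and identifiers are hypothetical.

```python
import base64
import zlib
from urllib.parse import parse_qs, urlencode, urlsplit
from xml.sax.saxutils import escape, quoteattr


def build_authn_request(request_id: str, issuer: str, acs_url: str,
                        issue_instant: str) -> str:
    """Return a minimal, unsigned SAML 2.0 AuthnRequest as an XML string."""
    return (
        '<samlp:AuthnRequest '
        'xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
        'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" '
        f'ID={quoteattr(request_id)} Version="2.0" '
        f'IssueInstant={quoteattr(issue_instant)} '
        f'AssertionConsumerServiceURL={quoteattr(acs_url)}>'
        f'<saml:Issuer>{escape(issuer)}</saml:Issuer>'
        '</samlp:AuthnRequest>'
    )


def redirect_url(idp_sso_url: str, authn_request_xml: str, relay_state: str) -> str:
    """Encode the request for the HTTP-Redirect binding and build the IdP URL."""
    # wbits=-15 selects raw DEFLATE (no zlib header or checksum).
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    deflated = compressor.compress(authn_request_xml.encode("utf-8"))
    deflated += compressor.flush()
    params = {
        "SAMLRequest": base64.b64encode(deflated).decode("ascii"),
        "RelayState": relay_state,
    }
    return idp_sso_url + "?" + urlencode(params)


def decode_saml_request(url: str) -> str:
    """Reverse the encoding, as the identity provider would on receipt."""
    query = parse_qs(urlsplit(url).query)
    deflated = base64.b64decode(query["SAMLRequest"][0])
    return zlib.decompress(deflated, -15).decode("utf-8")
```

The identity provider answers with a signed assertion about the user, and that assertion is what the service provider turns into its own session or token.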
19. ADFS SSO solution for Windows
• You can configure Federation Services in AD (Active Directory) so that ADFS can
serve as an IdP server and support SAML.
20. Shibboleth IdP 3.x Service
• Shibboleth is a standards based, open source software package for
web single sign-on across or within organizational boundaries.
• Open source project providing an IdP through SAML
• Supports SAML 2.0
• You can configure Shibboleth with FreeIPA.
• Shibboleth IdP v3
• Either build it from source or try it using the Docker image: “docker run -it
-v $(pwd):/ext-mount --rm unicon/shibboleth-idp init-idp.sh”
21. Conclusion
• Knox provides secure SSL access to Hadoop REST APIs and UIs.
• SSO support in Knox lets you manage authentication in a
more efficient and manageable way by leveraging identity provider
services through SAML.
• You can either use a commercial SaaS identity provider like Okta or
roll your own using your existing enterprise middleware like ADFS, or
even use the Shibboleth IdP as an open source alternative.
• We will add a Shibboleth IdP Knox SSO demo and provide future blogs
on this integration through the Knox community.
• Knox supports SSO for Ambari, Apache Ranger, and Apache Atlas.