API OIDC authentication mechanism #10905

Draft
wants to merge 62 commits into
base: develop
81d0ee0
api oidc authentication mechanism
ErykKul Oct 3, 2024
96fab76
replaced tabs with spaces
ErykKul Oct 3, 2024
cbde18f
better error handling for not authenticated users
ErykKul Oct 3, 2024
b137bbc
Update docker-compose-dev.yml
ErykKul Oct 3, 2024
e94ebc4
changed the demo user to admin and better error when user is not foun…
ErykKul Oct 3, 2024
2a2c583
admin user email fix
ErykKul Oct 3, 2024
b7937e6
restored kaycloak config to original
ErykKul Oct 3, 2024
d1120f8
restored bearer token config
ErykKul Oct 3, 2024
eea06bd
removed unused import
ErykKul Oct 3, 2024
1a5cc5d
simplified config
ErykKul Oct 3, 2024
5c1dc24
improved implementation with exposed tokens, no unverified emails blo…
ErykKul Oct 4, 2024
575d653
only verified email can be used to log in
ErykKul Oct 4, 2024
eea3b5c
fixed email verified check
ErykKul Oct 4, 2024
0703a65
oidc JSF log in
ErykKul Oct 4, 2024
ff8bb52
bearer token and oidc provider refactoring to use the new payara mech…
ErykKul Oct 4, 2024
61703e8
fixed log in issues
ErykKul Oct 4, 2024
004613f
redirect to first log in page and user lookup by user record identifier
ErykKul Oct 5, 2024
d377505
Multi-tenancy implementation
ErykKul Oct 5, 2024
895e054
bearer token mechanism for OIDC
ErykKul Oct 5, 2024
ac43c94
OIDC token is no loger stored in DB after first log in
ErykKul Oct 5, 2024
baa02ea
python bearer token example
ErykKul Oct 5, 2024
1db8c10
added documentation
ErykKul Oct 7, 2024
e603365
doc fix
ErykKul Oct 7, 2024
0123ab2
added a release note
ErykKul Oct 7, 2024
b0190e5
doc fix
ErykKul Oct 7, 2024
058c17a
doc fix
ErykKul Oct 7, 2024
4e6e8e5
doc fix
ErykKul Oct 7, 2024
6a635bf
doc fix
ErykKul Oct 7, 2024
3cf9d4d
doc fix
ErykKul Oct 7, 2024
22da240
Update src/main/java/edu/harvard/iq/dataverse/authorization/providers…
ErykKul Oct 7, 2024
96bb495
Update src/main/java/edu/harvard/iq/dataverse/authorization/providers…
ErykKul Oct 7, 2024
5950c94
Update doc/sphinx-guides/source/installation/oidc.rst
ErykKul Oct 7, 2024
6ff8744
Update doc/sphinx-guides/source/installation/oidc.rst
ErykKul Oct 7, 2024
803618d
Update doc/release-notes/PR-10905-OIDC-new-implementation.md
ErykKul Oct 7, 2024
68da25a
removed run_dev_env.sh and added it in .gitignore to prevent commitin…
ErykKul Oct 7, 2024
d600c51
moved python example
ErykKul Oct 7, 2024
817c416
restored accidently deleted line
ErykKul Oct 7, 2024
45facde
@ejb -> @inject
ErykKul Oct 7, 2024
9a44380
toJson -> JsonPrinter.json
ErykKul Oct 7, 2024
ef0f0f8
change BearerTokenMechanism class to final class
ErykKul Oct 7, 2024
0658a93
added comment
ErykKul Oct 7, 2024
6148051
added comment
ErykKul Oct 7, 2024
ca5ee82
removed unused injection
ErykKul Oct 7, 2024
b56210b
This is Javagit add -A! (Spartan kick here)
ErykKul Oct 7, 2024
139c9fc
reverted token auto-refreshing to default value
ErykKul Oct 7, 2024
28d70aa
made the fact that nobody has access to the content behind the authen…
ErykKul Oct 7, 2024
1ffdf70
renamed required role: all -> nobodyHasAccess
ErykKul Oct 7, 2024
5057c61
SEVERE -> FINE
ErykKul Oct 7, 2024
fc5fb0f
simplified method a bit
ErykKul Oct 7, 2024
a30647f
removed nimbus dependency
ErykKul Oct 7, 2024
9a4a702
added comments in the code
ErykKul Oct 8, 2024
e1b75f9
updated release note
ErykKul Oct 8, 2024
b90bb31
PKCE client example
ErykKul Oct 8, 2024
56c7ade
removed unneeded newline
ErykKul Oct 8, 2024
c151b5c
added ; at the end of the line
ErykKul Oct 8, 2024
f219d79
double quotes to single quotes
ErykKul Oct 8, 2024
e39749e
fixed server restart problem
ErykKul Oct 9, 2024
be96092
reverted commit on a wrong branch
ErykKul Oct 9, 2024
bb70526
simplified implementation
ErykKul Oct 13, 2024
0264957
removed uneeded log
ErykKul Oct 13, 2024
0ec91dd
updated configuration
ErykKul Oct 13, 2024
11e7de7
merged develop
ErykKul Nov 18, 2024
3 changes: 3 additions & 0 deletions .gitignore
Expand Up @@ -63,3 +63,6 @@ src/main/webapp/resources/images/dataverseproject.png.thumb140
# Docker development volumes
/docker-dev-volumes
/.vs

# custom run script for developers
run_dev_env.sh
23 changes: 23 additions & 0 deletions doc/release-notes/PR-10905-OIDC-new-implementation.md
@@ -0,0 +1,23 @@
New OpenID Connect implementation, including new log in scenarios (see [the guides](https://dataverse-guide--10905.org.readthedocs.build/en/10905/installation/oidc.html#choosing-provisioned-providers-at-log-in)) for the current JSF frontend, the new Single Page Application (SPA) frontend, and generic API usage. The API scenario using Bearer Token authorization is illustrated with a Python script that can be found in the `doc/sphinx-guides/_static/api/bearer-token-example` directory. This Python script uses Selenium to open a new browser window in which you log in to Keycloak. You can run the script with the following commands:

```shell
cd doc/sphinx-guides/_static/api/bearer-token-example
./run.sh
```

This script is safe for production use, as it does not require you to know the client secret or the user credentials. Therefore, you can safely distribute it as a part of your own Python script that lets users run some custom tasks.

The following settings are deprecated by this change and can be removed from the configuration:
- `dataverse.auth.oidc.pkce.enabled`
- `dataverse.auth.oidc.pkce.method`
- `dataverse.auth.oidc.pkce.max-cache-size`
- `dataverse.auth.oidc.pkce.max-cache-age`

The following settings are new:
- `dataverse.auth.oidc.issuer-identifier`
- `dataverse.auth.oidc.issuer-identifier-field`
- `dataverse.auth.oidc.subject-identifier-field`

Also, the bearer token authentication is now always enabled. Therefore, the `dataverse.feature.api-bearer-auth` feature flag is no longer used and can be removed from the configuration as well.

The new implementation now relies on the built-in OIDC support in our application server (Payara). With this change, the Nimbus SDK is no longer used and has been removed from the dependencies.
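
To make the configuration clean-up described in this release note concrete, here is a minimal Python sketch that flags settings that can now be removed from a list of configured option names. The setting names come from the release note; the `find_deprecated` helper and the sample input are hypothetical and are not part of Dataverse.

```python
# Settings the release note lists as deprecated or no longer used.
DEPRECATED_SETTINGS = {
    "dataverse.auth.oidc.pkce.enabled",
    "dataverse.auth.oidc.pkce.method",
    "dataverse.auth.oidc.pkce.max-cache-size",
    "dataverse.auth.oidc.pkce.max-cache-age",
    "dataverse.feature.api-bearer-auth",
}


def find_deprecated(configured):
    """Return the configured setting names that can now be removed, sorted."""
    return sorted(name for name in configured if name in DEPRECATED_SETTINGS)


if __name__ == "__main__":
    # Hypothetical list of option names taken from an installation's config.
    sample = [
        "dataverse.auth.oidc.enabled",
        "dataverse.auth.oidc.pkce.enabled",
        "dataverse.feature.api-bearer-auth",
    ]
    for name in find_deprecated(sample):
        print("can be removed:", name)
```

How you collect the configured names (JVM options, environment variables, MicroProfile Config sources) depends on your installation and is left out of this sketch.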
28 changes: 28 additions & 0 deletions doc/sphinx-guides/_static/api/bearer-token-example/get_session.py
@@ -0,0 +1,28 @@
import contextlib
import json
import re

import requests
import selenium.webdriver as webdriver
import selenium.webdriver.support.ui as ui

# Open a browser window so the user can log in to the OIDC provider.
with contextlib.closing(webdriver.Firefox()) as driver:
    driver.get("http://localhost:8080/oidc/login?target=API&oidcp=oidc-mpconfig")
    wait = ui.WebDriverWait(driver, 100)  # timeout after 100 seconds
    wait.until(lambda driver: "accessToken" in driver.page_source)
    # Reload the session endpoint as plain source to get the raw JSON.
    driver.get("view-source:http://localhost:8080/api/v1/oidc/session")
    result = wait.until(
        lambda driver: (
            driver.page_source if "accessToken" in driver.page_source else False
        )
    )

m = re.search("<pre>(.+?)</pre>", result)
if m:
    found = m.group(1)
    session = json.loads(found)

    token = session["data"]["accessToken"]
    endpoint = "http://localhost:8080/api/v1/users/:me"
    headers = {"Authorization": "Bearer " + token}

    print()
    print(requests.get(endpoint, headers=headers).json())
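
The parsing step in the script above can be isolated into a small function that is easier to test without a browser. The following is only a sketch: the `extract_access_token` name is hypothetical, and it assumes the page source wraps the JSON body in a `<pre>` element, as the regular expression in the script suggests.

```python
import html
import json
import re


def extract_access_token(page_source):
    """Return the access token from a <pre>-wrapped JSON body, or None."""
    match = re.search(r"<pre[^>]*>(.+?)</pre>", page_source, re.DOTALL)
    if match is None:
        return None
    # A browser may HTML-escape quotes and ampersands; undo that before parsing.
    body = html.unescape(match.group(1))
    try:
        session = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(session, dict):
        return None
    data = session.get("data")
    if not isinstance(data, dict):
        return None
    return data.get("accessToken")
```

Returning `None` instead of raising lets the caller decide whether a missing token means "not logged in yet" or a real error.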
@@ -0,0 +1,2 @@
selenium
requests
7 changes: 7 additions & 0 deletions doc/sphinx-guides/_static/api/bearer-token-example/run.sh
@@ -0,0 +1,7 @@
#!/bin/bash

python3 -m venv run_env
source run_env/bin/activate
📝 [shellcheck] reported by reviewdog 🐶
Not following: run_env/bin/activate: openBinaryFile: does not exist (No such file or directory) SC1091

python3 -m pip install -r requirements.txt
python3 get_session.py
rm -rf run_env
22 changes: 22 additions & 0 deletions doc/sphinx-guides/_static/frontend/PKCE-example/PKCE-example.html
@@ -0,0 +1,22 @@
<!doctype html>
<html>

<body>
<script src="http://unpkg.com/[email protected]/dist/keycloak-authz.js"></script>
<script src="http://unpkg.com/[email protected]/dist/keycloak.js"></script>

<script>
const kc = new Keycloak({
    url: 'http://keycloak.mydomain.com:8090',
    realm: 'test',
    clientId: 'test'
});
kc.init({
    pkceMethod: 'S256',
    redirectUri: 'http://localhost:8080/api/v1/users/:me'
});
kc.login();
</script>
</body>

</html>
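
The `pkceMethod: 'S256'` option in the example above refers to the code challenge method defined in RFC 7636, where the challenge is the unpadded base64url encoding of the SHA-256 hash of the code verifier. As an illustration only (the Keycloak client computes this for you), a minimal Python sketch with hypothetical helper names could look like this:

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes=32):
    """Generate a random URL-safe code verifier (43 characters for 32 bytes)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def s256_challenge(code_verifier):
    """Derive the S256 challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    verifier = make_code_verifier()
    print("code_verifier:", verifier)
    print("code_challenge:", s256_challenge(verifier))
```

The client sends the challenge with the authorization request and the verifier with the token request, so the provider can check that both requests come from the same client.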
Expand Up @@ -3,6 +3,6 @@
"factoryAlias":"oidc",
"title":"<a title - shown in UI>",
"subtitle":"<a subtitle - currently unused in UI>",
"factoryData":"type: oidc | issuer: <issuer url> | clientId: <client id> | clientSecret: <client secret> | pkceEnabled: <true/false> | pkceMethod: <PLAIN/S256/...>",
"factoryData":"type: oidc | issuer: <issuer url> | clientId: <client id> | clientSecret: <client secret> | issuerId: <issuer id> | issuerIdField: <issuer id field> | subjectIdField: <subject id field>",
"enabled":true
}
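
The `factoryData` value above is a single string of `key: value` pairs separated by `|`. To illustrate that layout, here is a minimal Python sketch that splits such a string into a dictionary. The `parse_factory_data` helper is hypothetical; it shows the format only and is not how Dataverse itself parses the value.

```python
def parse_factory_data(factory_data):
    """Split 'key: value | key: value' pairs into a dictionary."""
    result = {}
    for part in factory_data.split("|"):
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = part.partition(":")
        if not sep:
            continue  # skip fragments that are not 'key: value' pairs
        result[key.strip()] = value.strip()
    return result


if __name__ == "__main__":
    example = (
        "type: oidc | issuer: http://keycloak.mydomain.com:8090/realms/test"
        " | clientId: test | issuerIdField: iss | subjectIdField: sub"
    )
    for key, value in parse_factory_data(example).items():
        print(key, "=", value)
```

This sketch assumes that values never contain a `|` character; a value that does would be split incorrectly.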
11 changes: 6 additions & 5 deletions doc/sphinx-guides/source/api/auth.rst
Expand Up @@ -69,17 +69,18 @@ You can reset your API Token from your account page in your Dataverse installati
Bearer Tokens
-------------

Bearer tokens are defined in `RFC 6750`_ and can be used as an alternative to API tokens if your installation has been set up to use them (see :ref:`bearer-token-auth` in the Installation Guide).
Bearer tokens are defined in `RFC 6750`_ and can be used as an alternative to API tokens if your installation has been set up to use OpenID Connect log in (see :ref:`oidc-log-in` in the Installation Guide).

.. _RFC 6750: https://tools.ietf.org/html/rfc6750

To test if bearer tokens are working, you can try something like the following (using the :ref:`User Information` API endpoint), substituting in parameters for your installation and user.
To test if bearer tokens are working, you can use a Python script that uses Selenium to open a new browser window in which you log in to Keycloak. For example, you can run the script in the ``doc/sphinx-guides/_static/api/bearer-token-example`` directory:

.. code-block:: bash

export TOKEN=`curl -s -X POST --location "http://keycloak.mydomain.com:8090/realms/test/protocol/openid-connect/token" -H "Content-Type: application/x-www-form-urlencoded" -d "username=user&password=user&grant_type=password&client_id=test&client_secret=94XHrfNRwXsjqTqApRrwWmhDLDHpIYV8" | jq '.access_token' -r | tr -d "\n"`

curl -H "Authorization: Bearer $TOKEN" http://localhost:8080/api/users/:me
cd doc/sphinx-guides/_static/api/bearer-token-example
./run.sh

This script is safe for production use, as it does not require you to know the client secret or the user credentials. Therefore, you can safely distribute it as a part of your own Python script that lets users run some custom tasks.

Signed URLs
-----------
43 changes: 43 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
Expand Up @@ -6486,3 +6486,46 @@ Parameters:

``per_page`` Number of results returned per page.

.. _oidc-session:

Session
-------

The Session API returns information about the current OIDC session, once you have been successfully authenticated using OpenID Connect (see :ref:`oidc-log-in`).
You can either be redirected to this endpoint by the ``API`` log in flow, as illustrated in the :ref:`bearer-tokens` example, or go to this endpoint directly
after logging in in your browser. The returned JSON looks like this:

.. code-block:: json

   {
     "status": "OK",
     "data": {
       "user": {
         "id": 3,
         "userIdentifier": "aUser",
         "lastName": "User",
         "firstName": "Dataverse",
         "email": "[email protected]",
         "isSuperuser": false,
         "createdTime": "2024-10-07 08:26:29.453",
         "lastLoginTime": "2024-10-07 08:26:29.453",
         "deactivated": false,
         "mutedEmails": [],
         "mutedNotifications": []
       },
       "session": "6164900bf35e7f576a92e4f771cc",
       "accessToken": "eyJhbGc...7VvYOMYxreH-Uo3RpaA"
     }
   }

You can then use the retrieved `session` and `accessToken` for subsequent calls to the API or the session endpoint, as illustrated in the following curl examples:

.. code-block:: bash

export BEARER_TOKEN=eyJhbGc...7VvYOMYxreH-Uo3RpaA
export SESSION=6164900bf35e7f576a92e4f771cc
export SERVER_URL=https://demo.dataverse.org

curl -H "Authorization: Bearer $BEARER_TOKEN" "$SERVER_URL/api/oidc/session"

curl -v --cookie "JSESSIONID=$SESSION" "$SERVER_URL/api/oidc/session"
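
The two curl calls above differ only in how the caller is identified: an `Authorization: Bearer` header in the first, a `JSESSIONID` cookie in the second. A Python equivalent can be sketched with the standard library as follows. The function names are hypothetical, and the requests are only built here, not sent.

```python
import urllib.request

SESSION_PATH = "/api/oidc/session"


def session_request_with_token(server_url, bearer_token):
    """Build a request for the session endpoint using a bearer token."""
    request = urllib.request.Request(server_url.rstrip("/") + SESSION_PATH)
    request.add_header("Authorization", "Bearer " + bearer_token)
    return request


def session_request_with_cookie(server_url, session_id):
    """Build a request for the session endpoint using the session cookie."""
    request = urllib.request.Request(server_url.rstrip("/") + SESSION_PATH)
    request.add_header("Cookie", "JSESSIONID=" + session_id)
    return request
```

To send either request, pass it to `urllib.request.urlopen` and decode the JSON body of the response, substituting the server URL, token, and session ID of your own installation.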
4 changes: 1 addition & 3 deletions doc/sphinx-guides/source/installation/config.rst
Expand Up @@ -758,12 +758,10 @@ As for the "Remote only" authentication mode, it means that:
Bearer Token Authentication
---------------------------

Bearer tokens are defined in `RFC 6750`_ and can be used as an alternative to API tokens. This is an experimental feature hidden behind a feature flag.
Bearer tokens are defined in `RFC 6750`_ and can be used as an alternative to API tokens.

.. _RFC 6750: https://tools.ietf.org/html/rfc6750

To enable bearer tokens, you must install and configure Keycloak (for now, see :ref:`oidc-dev` in the Developer Guide) and enable ``api-bearer-auth`` under :ref:`feature-flags`.

You can test that bearer tokens are working by following the example under :ref:`bearer-tokens` in the API Guide.

.. _smtp-config:
67 changes: 29 additions & 38 deletions doc/sphinx-guides/source/installation/oidc.rst
Expand Up @@ -69,26 +69,6 @@ After adding a provider, the Log In page will by default show the "builtin" prov
In contrast to our :doc:`oauth2`, you can use multiple providers by creating distinct configurations enabled by
the same technology and without modifying the Dataverse Software code base (standards for the win!).


Contributor

We have been asked to implement PKCE support back in #8349. They claimed at the time that PKCE was required by Microsoft for all types of applications. Looking at https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-auth-code-flow#request-an-authorization-code, this seems not to be the case for a non-public client (which any Dataverse backend installation is).

So in theory, we should be able to remove it and hopefully not break anything for people out there (as long as they properly setup their stuff). We should probably discuss this at a tech hour to get more votes in.

Collaborator Author

I am planning to do some research for it as I have a feeling that it might still work. I could not see a reason why it wouldn't, and if it does not, we could ask Payara devs to implement it.

.. _oidc-pkce:

Enabling PKCE Security
^^^^^^^^^^^^^^^^^^^^^^

Many providers these days support or even require the usage of `PKCE <https://oauth.net/2/pkce/>`_ to safeguard against
some attacks and enable public clients that cannot have a secure secret to still use OpenID Connect (or OAuth2).

The Dataverse-built OIDC client can be configured to use PKCE and the method to use when creating the code challenge can be specified.
See also `this explanation of the flow <https://auth0.com/docs/get-started/authentication-and-authorization-flow/authorization-code-flow-with-proof-key-for-code-exchange-pkce>`_
for details on how this works.

As we are using the `Nimbus SDK <https://connect2id.com/products/nimbus-oauth-openid-connect-sdk>`_ as our client
library, we support the standard ``PLAIN`` and ``S256`` (SHA-256) code challenge methods. "SHA-256 method" is the default
as recommend in `RFC7636 <https://datatracker.ietf.org/doc/html/rfc7636#section-4.2>`_. If your provider needs some
other method, please open an issue.

The provisioning sections below contain in the example the parameters you may use to configure PKCE.

Provision a Provider
--------------------

Expand All @@ -106,9 +86,6 @@ requires fewer extra steps and allows you to keep more configuration in a single
Provision via REST API
^^^^^^^^^^^^^^^^^^^^^^

Note: you may omit the PKCE related settings from ``factoryData`` below if you don't plan on using PKCE - default is
disabled.

Please create a :download:`my-oidc-provider.json <../_static/installation/files/root/auth-providers/oidc.json>` file, replacing every ``<...>`` with your values:

.. literalinclude:: /_static/installation/files/root/auth-providers/oidc.json
Expand Down Expand Up @@ -163,14 +140,6 @@ The following options are available:
- The base URL of the OpenID Connect (OIDC) server as explained above.
- Y
- \-
* - ``dataverse.auth.oidc.pkce.enabled``
- Set to ``true`` to enable :ref:`PKCE <oidc-pkce>` in auth flow.
- N
- ``false``
* - ``dataverse.auth.oidc.pkce.method``
- Set code challenge method. The default value is the current best practice in the literature.
- N
- ``S256``
* - ``dataverse.auth.oidc.title``
- The UI visible name for this provider in login options.
- N
Expand All @@ -179,12 +148,34 @@ The following options are available:
- A subtitle, currently not displayed by the UI.
- N
- ``OpenID Connect``
* - ``dataverse.auth.oidc.pkce.max-cache-size``
- Tune the maximum size of all OIDC providers' verifier cache (the number of outstanding PKCE-enabled auth responses).
* - ``dataverse.auth.oidc.issuer-identifier``
- Issuer identifier value as found in the JWT token claims under ``dataverse.auth.oidc.issuer-identifier-field``.
- N
- 10000
* - ``dataverse.auth.oidc.pkce.max-cache-age``
- Tune the maximum age, in seconds, of all OIDC providers' verifier cache entries. Default is 5 minutes, equivalent to lifetime
of many OIDC access tokens.
- ``value from dataverse.auth.oidc.auth-server-url``
* - ``dataverse.auth.oidc.issuer-identifier-field``
- Issuer identifier field name in the JWT token claims.
- N
- 300
- ``iss``
* - ``dataverse.auth.oidc.subject-identifier-field``
- Subject identifier field name in the JWT token claims.
- N
- ``sub``
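
To illustrate what the three new options refer to, the following Python sketch reads the issuer and subject from the claims of a JWT, using configurable field names that default to `iss` and `sub`. This is an illustration under loud assumptions: the function names are hypothetical, and the token is decoded WITHOUT verifying its signature, which is only acceptable for inspection and never for authentication. The actual validation is done by the application server.

```python
import base64
import json


def decode_claims_unverified(jwt_token):
    """Decode the payload of a JWT WITHOUT verifying the signature."""
    parts = jwt_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload = parts[1]
    # base64url payloads omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def identify(claims, issuer_field="iss", subject_field="sub"):
    """Return (issuer, subject) read from the configured claim fields."""
    return claims[issuer_field], claims[subject_field]


def issuer_matches(claims, expected_issuer, issuer_field="iss"):
    """Check the issuer claim against the configured issuer identifier."""
    return claims.get(issuer_field) == expected_issuer
```

In this sketch, `issuer_field` and `subject_field` play the role of `dataverse.auth.oidc.issuer-identifier-field` and `dataverse.auth.oidc.subject-identifier-field`, and `expected_issuer` plays the role of `dataverse.auth.oidc.issuer-identifier`.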

.. _oidc-log-in:

Choosing Provisioned Providers at Log In
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the JSF frontend, you can select the provider you wish to log in with at log in time. You can also use the log in link directly, for example from a Python script, as illustrated in the ``doc/sphinx-guides/_static/api/bearer-token-example`` example (see :ref:`bearer-tokens`). If you open that link in a
browser, you are prompted to log in to Keycloak and then redirected to the API endpoint for retrieving the session (see :ref:`oidc-session`):
http://localhost:8080/oidc/login?target=API&oidcp=oidc-mpconfig

The ``oidcp`` parameter is the ID of the provisioned provider you wish to use, as configured in the previous steps. For example,
``oidc-mpconfig`` is the provider configured with the JVM options; it is also the default provider if this parameter is not included
in the request. The ``target`` parameter names the location you want to be redirected to after a successful log in. You are first
redirected to the callback endpoint of the OpenID Connect flow (``/oidc/callback/*``), which in turn redirects you to the location
chosen in the ``target`` parameter:

- `JSF` is the default target, and it redirects you to the JSF frontend
- `API` redirects you to the session endpoint of the native API :ref:`oidc-session`, from which you can recover the session ID and the bearer token for the API access
- `SPA` redirects you to the new SPA, if it is already installed on your system
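
The log in link described above can be assembled programmatically. Here is a minimal Python sketch: the `build_login_url` helper is hypothetical, while the path, the parameter names, and the three targets are taken from the text above.

```python
from urllib.parse import urlencode

VALID_TARGETS = ("JSF", "API", "SPA")


def build_login_url(base_url, target="JSF", provider_id=None):
    """Build the OIDC log in URL for the given target and optional provider."""
    if target not in VALID_TARGETS:
        raise ValueError("target must be one of " + ", ".join(VALID_TARGETS))
    params = {"target": target}
    if provider_id is not None:
        # Leaving this out makes the server fall back to the default provider.
        params["oidcp"] = provider_id
    return base_url.rstrip("/") + "/oidc/login?" + urlencode(params)


if __name__ == "__main__":
    print(build_login_url("http://localhost:8080", "API", "oidc-mpconfig"))
```

Using `urlencode` keeps the URL valid if a provider ID ever contains characters that need escaping.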
1 change: 0 additions & 1 deletion docker-compose-dev.yml
Expand Up @@ -16,7 +16,6 @@ services:
ENABLE_RELOAD: "1"
SKIP_DEPLOY: "${SKIP_DEPLOY}"
DATAVERSE_JSF_REFRESH_PERIOD: "1"
DATAVERSE_FEATURE_API_BEARER_AUTH: "1"
DATAVERSE_MAIL_SYSTEM_EMAIL: "dataverse@localhost"
DATAVERSE_MAIL_MTA_HOST: "smtp"
DATAVERSE_AUTH_OIDC_ENABLED: "1"
1 change: 0 additions & 1 deletion docker/compose/demo/compose.yml
Expand Up @@ -13,7 +13,6 @@ services:
DATAVERSE_DB_HOST: postgres
DATAVERSE_DB_PASSWORD: secret
DATAVERSE_DB_USER: dataverse
DATAVERSE_FEATURE_API_BEARER_AUTH: "1"
DATAVERSE_MAIL_SYSTEM_EMAIL: "Demo Dataverse <[email protected]>"
DATAVERSE_MAIL_MTA_HOST: "smtp"
JVM_ARGS: -Ddataverse.files.storage-driver-id=file1
6 changes: 0 additions & 6 deletions pom.xml
Expand Up @@ -462,12 +462,6 @@
<artifactId>scribejava-apis</artifactId>
<version>6.9.0</version>
</dependency>
<!-- OpenID Connect authentication -->
<dependency>
<groupId>com.nimbusds</groupId>
<artifactId>oauth2-oidc-sdk</artifactId>
<version>10.13.2</version>
</dependency>
<!-- Caching library, current main use case is for OIDC authentication -->
<dependency>
<groupId>com.github.ben-manes.caffeine</groupId>