-
Notifications
You must be signed in to change notification settings - Fork 494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle unregistered users in BearerTokenAuthMechanism and implement user registration mechanism #10972
base: develop
Are you sure you want to change the base?
Conversation
… is validated but there is no registered user account
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…nAuthMechanism to AuthenticationServiceBean
…earer token auth feature flags compatibility
… logic paths depending on the value to RegisterOIDCUserCommand
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…ed in docker-compose-dev
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
2 similar comments
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall - lots of changes, lots of tests! I made a few suggestions for changes and this probably needs a release note.
); | ||
registerOidcUserResponse.then().assertThat() | ||
.statusCode(UNAUTHORIZED.getStatusCode()) | ||
.body("message", equalTo("Unauthorized bearer token.")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general we try to avoid string comparisons so that tests don't break when messages change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, the error messages are meaningful for the test assertions, as depending on the point where the flow fails, one message or another will be returned, even though the HTTP error code is the same (401) (See the different AuthorizationExceptions that are thrown in
dataverse/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticationServiceBean.java
Line 1001 in 73c4079
public OIDCUserInfo verifyOIDCBearerTokenAndGetUserIdentifier(String bearerToken) throws AuthorizationException { |
As far as I know, Dataverse does not implement custom response error codes, so I had to rely on the response error messages for this test. I understand the concern about string comparisons and am open to revisiting this approach if alternative mechanisms (such as custom response error codes) become available in the future.
|
||
curl -H "Authorization: Bearer $TOKEN" -X POST http://localhost:8080/api/users/register --data '{"termsAccepted":true}' | ||
|
||
It is essential to send a JSON that includes the property ``termsAccepted`` set to true, which indicates that you accept the terms of service of Dataverse. Otherwise, you will not be able to create an account. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What terms? Is it https://guides.dataverse.org/en/latest/api/native-api.html#get-api-terms-of-use-url - if so can that be referenced here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, I am referring to the terms of use of the installation, the ones present in the sign-up form and are configurable in the installation.
I have rephrased the text to: "It is essential to send a JSON that includes the property termsAccepted
set to true, which indicates that you accept the Terms of Use of the installation. Otherwise, you will not be able to create an account."
I don't think I can provide any links to the terms as they are embedded and not present by default.
|
||
It is essential to send a JSON that includes the property ``termsAccepted`` set to true, which indicates that you accept the terms of service of Dataverse. Otherwise, you will not be able to create an account. | ||
|
||
In this JSON, we can also include the fields ``position`` or ``affiliation``, in the same way as when we register a user through the Dataverse UI. These fields are optional, and if not provided, they will be persisted as empty in Dataverse. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these never available from the token/user info? Should they be covered by the api-bearer-auth-json-claims flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are not available, at least by default.
These are not standard OIDC claims, as these properties are specific to the Dataverse domain. They cannot be found in the PersonClaims class of the Nimbus SDK neither.
In my opinion, these properties should not be linked to the api-bearer-auth-json-claims
flag, as they represent Dataverse-specific properties that should be provided during sign-up (if desired, since they are optional).
- ``lastName`` | ||
- ``emailAddress`` | ||
|
||
Note that even if they are included in the JSON, if it is possible to retrieve the corresponding claims from the identity provider, these values will be ignored and the ones from the identity provider will be used instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood, in the comments, you suggested that ignoring the values could be confusing. Would it be better to just fail if you send info that is already in the token? (+1 from me)
@@ -3343,6 +3343,12 @@ please find all known feature flags below. Any of these flags can be activated u | |||
* - api-session-auth | |||
- Enables API authentication via session cookie (JSESSIONID). **Caution: Enabling this feature flag exposes the installation to CSRF risks!** We expect this feature flag to be temporary (only used by frontend developers, see `#9063 <https://github.com/IQSS/dataverse/issues/9063>`_) and for the feature to be removed in the future. | |||
- ``Off`` | |||
* - api-bearer-auth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: I originally thought that there would be a separate api-bearer-registration flag rather than overloading this flag. Is it always true that if people allow bearer tokens that also want people to be able to create new accounts this way? I'm not sure, but unless someone thinks this use case is real, I'm OK with it as is with one flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a separate flag. I'm not sure if perhaps you have missed it. There are two flags: api-bearer-auth
and api-bearer-auth-provide-missing-claims
(the one used in registration when the provider's claims are insufficient). I added both to the docs, as api-bearer-auth
was missing from the Feature Flags section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I meant 3 - allow authorizing by token, allow reg by token, allow reg by token with user supplied info. At QDR (a fork) we create accounts via a different mechanism tied to our single sign on mechanism and will eventually want api access via token but not registration via token. As I said above, I don't know that there's a real non-fork use case where other people will want this, so I think it is OK to stay with just the 2 flags you have. We can make a change in the future if needed.
for (OIDCAuthProvider provider : providers) { | ||
try { | ||
// Retrieve both user identifier and user info | ||
Optional<UserRecordIdentifier> userRecordIdentifier = provider.getUserIdentifier(accessToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getUserIdentifier calls getUserInfo at (
Line 336 in bbf29cc
Optional<UserInfo> userInfo = getUserInfo(accessToken); |
FWIW: I was concerned that I didn't see the access token being validated anywhere, but since the callback to the userinfo has to succeed for either the getUserIdentifier or getUserInfo to succeed, we know that the token is valid.
userDTO.setEmailAddress(getValueOrDefault(userClaimsInfo.getEmailAddress(), userDTO.getEmailAddress())); | ||
} else { | ||
// Always use the claims provided by the OIDC provider, regardless of whether they are null or not | ||
userDTO.setUsername(userClaimsInfo.getPreferredUsername()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the comment earlier - should we fail here if you send JSON with these values/with conflicting values?
// If either is present, return the result | ||
if (userRecordIdentifier.isPresent() || userInfo.isPresent()) { | ||
logger.log(Level.FINE, "Bearer token detected, provider {0} confirmed validity and provided user info", provider.getId()); | ||
return new OIDCUserInfo(userRecordIdentifier.get(), userInfo.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you return the OAuth2UserRecord from getUserRecord as suggested above, do you have all the info you need? Or could you add the missing fields to it? Versus creating a new aggregator class?
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
📦 Pushed preview images as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name. |
What this PR does / why we need it:
This PR includes a mechanism for registering new users in Dataverse through a Bearer Token.
BearerTokenAuthMechanism
The Bearer Token authentication functionality (available through feature flag) has been extended to properly handle cases where BearerTokenAuthMechanism successfully validates the token but cannot identify any Dataverse user because there is no account associated with the token. See the current development version of BearerTokenAuthMechanism. (https://github.com/IQSS/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/api/auth/BearerTokenAuthMechanism.java#L58).
The BearerTokenAuthMechanism has been extended to return a meaningful error in the scenario described above. Specifically, the calling user would receive a 403 error, indicating that the token is validated but the user is not registered in Dataverse.
Additionally, the entire BearerTokenAuthMechanism class has been refactored, moving authentication and token verification logic to AuthService, since this logic is also utilized by the new user registration functionality included in this PR, and it also results in a simpler version of BearerTokenAuthMechanism, more testable and with more limited responsability.
User registration endpoint
The new endpoint takes the bearer token and a UserDTO object with the required properties to create a user account in Dataverse. It only works if the bearer token feature flag is enabled. A RegisterOidcUserCommand handles the validation of the bearer token and user data, and the account creation process. It can return various errors, including a new InvalidFieldsCommandException, which lists any validation errors in the UserDTO. This exception is now handled by AbstractApiBean like other command exceptions and can be used for future purposes.
Which issue(s) this PR closes:
Special notes for your reviewer:
See Handle unregistered users in BearerTokenAuthMechanism and implement user registration mechanism #10972 (comment) to understand why the OIDC UserInfo endpoint is not used in this design
The IT for the new registration endpoint is disabled because it depends on the containerized environment, which is not available in Jenkins. Additionally, the feature flag would not be active, so it would fail for this reason as well. Changes have been made to the Keycloak realm JSON to suit the needs of this IT, particularly the modification of roles in the test realm to allow an admin user to create a random test user in the IT setup.
Suggestions on how to test this:
Visual inspection of the API integration tests. Overall, this PR is extensively tested with unit and integration tests, but it is in the ITs where we can see how the endpoint calls work and the different responses that can be obtained.
We can also run this branch locally and reproduce the same calls made in the ITs using curl.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
No
Is there a release notes update needed for this change?:
I don't think so. Since this is still under a feature flag.
Additional documentation:
None