Adding access token refresh mechanism#92
Conversation
liamlynch-data
left a comment
There was a problem hiding this comment.
Had a look at the code - I've noticed a significant problem and some areas for improvement. Probably best to address these before I do a test.
Also some info on the best testing strategies would be good.
Specifically
- how to do/fake token expiry
- what we need to do to invoke the two paths in
__init__()for AUTH_OAUTH method, I'm still not quite sure when each is invoked
Also, to be clear on scope, which methods should it work for? It says in the changelog:
Add access token refreshing for Certificates and App username password
But I think for those they only get an access token not refresh method, so there's no way to refresh the token is there? I can't see how it would work. 🤔
And I was unclear about what the following means in the SC card - will some of this not work without DSS changes? Do I need to understand this?
_- Since the token is retrieved by the plugin for ** and ** preset, it is possible to refresh it in time.
- The option of using a refreshed token by the plugin will be offered with DSS 14.5, this could be implemented at the same time_
| self.apply_paths_overwrite(config) | ||
| self.setup_sharepoint_online_url(login_details) | ||
| self.sharepoint_access_token = login_details['sharepoint_oauth'] | ||
| if "__credentials" in login_details: |
There was a problem hiding this comment.
✍️ Could you put a comment here to explain what these two cases are here and when each one will happen? Also maybe mention where __credentials is populated and what kind of structure should be in there.
| if "__credentials" in login_details: | ||
| logger.info("Refreshable access token") | ||
| from dataiku.core import plugin | ||
| access_token = plugin.OAuthCredentials(login_details.get("__credentials", {}).get("sharepoint_oauth")) |
There was a problem hiding this comment.
⛏️ access_token isn't an access token, is it? It's more a python object. In fact it functions as a access_token_getter as you use elsewhere. Why not call it that 🤔 ?
| if isinstance(token_refresh_method, str): | ||
| logger.info("No refresh method available") | ||
| self.current_token = token_refresh_method |
There was a problem hiding this comment.
So here you pass something in called 'token_refresh_method' then check the type only to then say "No refresh method available". Which implies the name of token_refresh_method was a bit of a lie.... I feel it is a bit illogical/confusing...
There might be a clearer way - e.g. two separate parameters with names maybe
| @@ -59,12 +60,19 @@ def __init__(self, config, root_name_overwrite_legacy_mode=False): | |||
| self.apply_paths_overwrite(config) | |||
| self.setup_sharepoint_online_url(login_details) | |||
| self.sharepoint_access_token = login_details['sharepoint_oauth'] | |||
There was a problem hiding this comment.
❓ What are the possibilities for sharepoint_access_token ?
Is it in fact always a string that is an access token? Or could it be something else?
FreshToken allows a couple of options so this is not quite clear. (If it ca be something other than a token it would need a different name)
Also, why is this variable still associated with the client self ? It is only used once, there's no need to keep a reference to it any more. The token is actually managed in the sharepoint session, so this is kind of a "red-herring" now.
| def refresh_token(self): | ||
| self.current_token = self.token_refresh_method() | ||
| decoded_jwt = decode_jwt(self.current_token) | ||
| self.token_validity = decoded_jwt.get("exp", None) |
There was a problem hiding this comment.
token_validity is a confusing name for this to me. Validity implies the property of being valid or not, not the time when something expires. Also None seems to imply valid - anti-intuitive maybe if this is mean to be "validity" an abstract way?
Maybe just call it token_renewal_time ? This is the time after which we will renew it I think.
| if (epoch_time_now > self.token_validity): | ||
| return False | ||
| return True |
There was a problem hiding this comment.
⛏️ No need for two return statements really, could just return the boolean self.token_validity <= epoch_time_now
| def _default_refresh_method(self): | ||
| return self.current_token | ||
|
|
||
| def is_token_still_valid(self): |
There was a problem hiding this comment.
⛏️ 🤔 is_token_still_valid is not strictly true. You could switch the logic and call it token_needs_renewal and then it would be more accurate, make slightly more sense
| if len(sub_tokens) < 2: | ||
| logger.error("JWT format is wrong") | ||
| return {} | ||
| token_of_interest = sub_tokens[1] |
There was a problem hiding this comment.
⛏️ sub_tokens[1] is the encoded token payload - probably we should call it that token_payload something
| logger.error("JWT format is wrong") | ||
| return {} | ||
| token_of_interest = sub_tokens[1] | ||
| padded_token = token_of_interest + "="*divmod(len(token_of_interest), 4)[1] |
There was a problem hiding this comment.
So it would be something like:
padded_token = token_of_interest + "=" *((4 - divmod(len(token_of_interest), 4)[1]) % 4)
But there is a much less fugly way as with a negative argument, it effectively goes to the higher magnitude and the remainder counts back. Also you can just use %. So this should work:
padded_token = token_of_interest + "=" * (-len(token_of_interest) % 4)
There was a problem hiding this comment.
Also - I think this needs a unit test, this is not the sort of thing we can test manually 🤔
https://app.shortcut.com/dataiku/story/311249