Skip to content

Adding access token refresh mechanism#92

Open
alexbourret wants to merge 5 commits intomasterfrom
feature/access-token-refresh
Open

Adding access token refresh mechanism#92
alexbourret wants to merge 5 commits intomasterfrom
feature/access-token-refresh

Conversation

@alexbourret
Copy link
Copy Markdown
Collaborator

@alexbourret alexbourret commented Apr 8, 2026

Copy link
Copy Markdown
Contributor

@liamlynch-data liamlynch-data left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look at the code - I've noticed a significant problem and some areas for improvement. Probably best to address these before I do a test.

Also some info on the best testing strategies would be good.
Specifically

  • how to do/fake token expiry
  • what we need to do to invoke the two paths in __init__() for AUTH_OAUTH method, I'm still not quite sure when each is invoked

Also, to be clear on scope, which methods should it work for? It says in the changelog:
Add access token refreshing for Certificates and App username password
But I think for those they only get an access token not refresh method, so there's no way to refresh the token is there? I can't see how it would work. 🤔

And I was unclear about what the following means in the SC card - will some of this not work without DSS changes? Do I need to understand this?

_- Since the token is retrieved by the plugin for ** and ** preset, it is possible to refresh it in time.

  • The option of using a refreshed token by the plugin will be offered with DSS 14.5, this could be implemented at the same time_

self.apply_paths_overwrite(config)
self.setup_sharepoint_online_url(login_details)
self.sharepoint_access_token = login_details['sharepoint_oauth']
if "__credentials" in login_details:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✍️ Could you put a comment here to explain what these two cases are here and when each one will happen? Also maybe mention where __credentials is populated and what kind of structure should be in there.

if "__credentials" in login_details:
logger.info("Refreshable access token")
from dataiku.core import plugin
access_token = plugin.OAuthCredentials(login_details.get("__credentials", {}).get("sharepoint_oauth"))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ access_token isn't an access token, is it? It's more a python object. In fact it functions as a access_token_getter as you use elsewhere. Why not call it that 🤔 ?

Comment on lines +12 to +14
if isinstance(token_refresh_method, str):
logger.info("No refresh method available")
self.current_token = token_refresh_method
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here you pass something in called 'token_refresh_method' then check the type only to then say "No refresh method available". Which implies the name of token_refresh_method was a bit of a lie.... I feel it is a bit illogical/confusing...

There might be a clearer way - e.g. two separate parameters with names maybe

@@ -59,12 +60,19 @@ def __init__(self, config, root_name_overwrite_legacy_mode=False):
self.apply_paths_overwrite(config)
self.setup_sharepoint_online_url(login_details)
self.sharepoint_access_token = login_details['sharepoint_oauth']
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ What are the possibilities for sharepoint_access_token ?
Is it in fact always a string that is an access token? Or could it be something else?
FreshToken allows a couple of options so this is not quite clear. (If it ca be something other than a token it would need a different name)

Also, why is this variable still associated with the client self ? It is only used once, there's no need to keep a reference to it any more. The token is actually managed in the sharepoint session, so this is kind of a "red-herring" now.

def refresh_token(self):
self.current_token = self.token_refresh_method()
decoded_jwt = decode_jwt(self.current_token)
self.token_validity = decoded_jwt.get("exp", None)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

token_validity is a confusing name for this to me. Validity implies the property of being valid or not, not the time when something expires. Also None seems to imply valid - anti-intuitive maybe if this is mean to be "validity" an abstract way?
Maybe just call it token_renewal_time ? This is the time after which we will renew it I think.

Comment on lines +29 to +31
if (epoch_time_now > self.token_validity):
return False
return True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ No need for two return statements really, could just return the boolean self.token_validity <= epoch_time_now

def _default_refresh_method(self):
return self.current_token

def is_token_still_valid(self):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ 🤔 is_token_still_valid is not strictly true. You could switch the logic and call it token_needs_renewal and then it would be more accurate, make slightly more sense

if len(sub_tokens) < 2:
logger.error("JWT format is wrong")
return {}
token_of_interest = sub_tokens[1]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ sub_tokens[1] is the encoded token payload - probably we should call it that token_payload something

logger.error("JWT format is wrong")
return {}
token_of_interest = sub_tokens[1]
padded_token = token_of_interest + "="*divmod(len(token_of_interest), 4)[1]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ looks like an error here - we want the remaining number of padding symbols needed to get it up to a multiple of 4, not the remainder from a division (which will count back down to a multiple of 4)

So it would be something like:
padded_token = token_of_interest + "=" *((4 - divmod(len(token_of_interest), 4)[1]) % 4)

But there is a much less fugly way as with a negative argument, it effectively goes to the higher magnitude and the remainder counts back. Also you can just use %. So this should work:

padded_token = token_of_interest + "=" * (-len(token_of_interest) % 4)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also - I think this needs a unit test, this is not the sort of thing we can test manually 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants