feat: gdrive export and encryption service integration #5250
… DB schema
- Add user_oauth_token table to store encrypted OAuth refresh tokens per provider
- Add TokenEncryptionService using jose4j AES-256-GCM for encrypting auth blobs
- Add AuthConfig.encryptionSecretKey reading from auth.encryption.256-bit-secret
- Add GoogleDriveAuthResource with /connect, /callback, and /token endpoints
- Add GoogleAuthResource config endpoint exposing client ID and redirect URI
- Add DriveTokenIssueResponse and GoogleAuthConfigResponse HTTP models
- Wire GoogleDriveAuthResource into TexeraWebApplication and GuestAuthFilter
- Add google.client-id, client-secret, and app-domain to UserSystemConfig
- Update k8s values with new config keys
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nd error case Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #5250 +/- ##
============================================
- Coverage 49.62% 49.40% -0.23%
+ Complexity 2384 2381 -3
============================================
Files 1051 1051
Lines 40399 40399
Branches 4292 4292
============================================
- Hits 20050 19960 -90
- Misses 19165 19260 +95
+ Partials 1184 1179 -5
xuang7
left a comment
Thanks for the PR! Left a few comments. Please follow the formatting instructions in the
contributing guide and fix the formatting issues.
    @QueryParam("reauth") @DefaultValue("false") reauth: Boolean
): Response = {
  val user = sessionUser.getUser
  val state = JwtAuth.jwtToken(jwtClaims(user, TOKEN_EXPIRE_TIME_IN_MINUTES))
We should avoid using the normal session JWT as the OAuth state. Since it is still a valid login token before expiration, it may be safer to use a dedicated short-lived OAuth state token instead.
try {
  val blob = mapper.readTree(TokenEncryptionService.decrypt(record.getAuthBlob))
  val refreshToken = blob.get("refreshToken").asText()
Suggest using path("refreshToken").asText("") here to avoid a possible NPE when the field is missing.
@xuang7 I can add this, but since this is wrapped in a try-catch, I feel the resulting error is better defined than getting "", sending a request to Google, and getting an error back from there.
That makes sense. I agree we should not silently send an empty refresh token to Google. Maybe we can use path("refreshToken").asText("") to avoid the possible NPE, then explicitly check whether the token is empty and return no_refresh_token locally before making the Google request.
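A minimal sketch of the check discussed in this thread, assuming Jackson's ObjectMapper is what mapper refers to in the snippet above. The object name RefreshTokenLookup, the extract helper, the Either return type, and the pinned Jackson version in the first line are illustrative assumptions, not code from this PR.

```scala
//> using dep "com.fasterxml.jackson.core:jackson-databind:2.17.2"
import com.fasterxml.jackson.databind.ObjectMapper

// Hypothetical helper, not part of the PR.
object RefreshTokenLookup {
  private val mapper = new ObjectMapper()

  // Left carries the error code to return locally,
  // Right carries a non-empty refresh token.
  def extract(decryptedBlob: String): Either[String, String] = {
    // path() yields a MissingNode rather than null when the field is absent,
    // and asText("") falls back to "" for missing or null values,
    // so this line cannot throw a NullPointerException.
    val token = mapper.readTree(decryptedBlob).path("refreshToken").asText("")
    // Stop here instead of sending an empty token to Google.
    if (token.isEmpty) Left("no_refresh_token") else Right(token)
  }
}
```

With this shape the caller only contacts Google on the Right branch, so a blob without a token produces the same no_refresh_token result as a missing record.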
…ogleDriveAuthResource OAuth state is now a UUID stored in a ConcurrentHashMap with a 10-minute TTL, consumed exactly once on callback. Removes JwtParser/JwtAuth dependency from the Drive resource and avoids encoding user info in the callback URL. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
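The one-time state handling described in this commit message could look roughly like the sketch below. Only the UUID, the ConcurrentHashMap, the 10-minute TTL, and the consume-once behavior come from the commit message; the class name OAuthStateStore, its methods, the Int user id, and the injectable clock are assumptions made for illustration.

```scala
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

// Hypothetical store, not the code in GoogleDriveAuthResource.
final class OAuthStateStore(
    ttlMillis: Long = 10 * 60 * 1000L,                 // 10-minute TTL
    now: () => Long = () => System.currentTimeMillis() // injectable clock for tests
) {
  private case class Entry(userId: Int, expiresAt: Long)
  private val pending = new ConcurrentHashMap[String, Entry]()

  // Called from /connect: returns an opaque state value that carries no user info.
  def issue(userId: Int): String = {
    val state = UUID.randomUUID().toString
    pending.put(state, Entry(userId, now() + ttlMillis))
    state
  }

  // Called from /callback: remove() is atomic, so a state can be redeemed only once.
  // Unknown and expired states both yield None.
  def consume(state: String): Option[Int] =
    Option(pending.remove(state)).filter(_.expiresAt > now()).map(_.userId)

  // States that are never redeemed stay in the map until something sweeps them.
  def purgeExpired(): Unit = {
    pending.values().removeIf(e => e.expiresAt <= now())
    ()
  }
}
```

An in-memory map like this only works when /connect and /callback reach the same JVM; a deployment with several replicas would need shared storage for the pending states.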
Removed random secret key for eSecretKey
Added default asText("") to avoid NPE
…_token
- Add DELETE /api/auth/google/drive/disconnect to remove stored OAuth token
- Add created_at and updated_at columns to user_oauth_token table
- Set updated_at on token refresh in callback
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  .filter(r => r.getProvider == PROVIDER_GOOGLE_DRIVE)
  .findFirst()

if (existing.isPresent) {
Disconnect currently only removes the local DB row, but it does not revoke the grant on Google’s side. Would it make sense to also revoke the token with Google before deleting the local row? That would better match the expected “Disconnect” behavior.
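One possible shape for this, sketched under assumptions: Google documents https://oauth2.googleapis.com/revoke as its OAuth 2.0 revocation endpoint, taking the token as a query parameter on a POST. The object DriveDisconnect, the injected send and deleteLocalRow functions, and the choice to delete the local row even when revocation fails are all hypothetical and not taken from the PR.

```scala
import java.io.IOException
import java.net.{URI, URLEncoder}
import java.net.http.HttpRequest
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of the PR.
object DriveDisconnect {
  private val RevokeUrl = "https://oauth2.googleapis.com/revoke"

  // Builds the revocation request; the token may be a refresh or an access token.
  def revokeRequest(token: String): HttpRequest = {
    val encoded = URLEncoder.encode(token, StandardCharsets.UTF_8)
    HttpRequest
      .newBuilder(URI.create(s"$RevokeUrl?token=$encoded"))
      .header("Content-Type", "application/x-www-form-urlencoded")
      .POST(HttpRequest.BodyPublishers.noBody())
      .build()
  }

  // send performs the HTTP call and returns the status code;
  // deleteLocalRow removes the user_oauth_token row.
  // Both are injected so the flow can be exercised without network or DB access.
  def disconnect(
      token: String,
      send: HttpRequest => Int,
      deleteLocalRow: () => Unit
  ): Boolean = {
    val revoked =
      try send(revokeRequest(token)) == 200
      catch { case _: IOException => false }
    // Assumption: the local row is removed even if Google could not be reached,
    // so the user is never stuck in a connected state.
    deleteLocalRow()
    revoked
  }
}
```

Returning the revocation outcome lets the endpoint log or report a grant that was dropped locally but may still be active on Google's side.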
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
What changes were proposed in this PR?
Adds the backend required for Google Drive OAuth integration.
Schema changes: Adds a new user_oauth_token table (sql/updates/23.sql) to store encrypted OAuth tokens per provider. The provider column (google_drive, etc.) is intentionally generic so future integrations (AWS, Microsoft) can reuse the same table without a schema change. The auth blob is stored as a JWE-encrypted JSON string rather than a raw token.
Token encryption: Adds TokenEncryptionService using jose4j AES-256-GCM (DIRECT key management) to encrypt auth blobs at rest. The encryption key is read from auth.encryption.256-bit-secret in auth.conf, with AUTH_ENCRYPTION_SECRET as the env-var override. This follows the same pattern as the existing JWT secret key.
New endpoints (GoogleDriveAuthResource):
- GET /api/auth/google/drive/connect: Returns a Google OAuth authorization URL for the frontend to open in a popup. Accepts a reauth query param; when true, sets prompt=consent to force Google to re-issue a refresh token (used when a previous token has returned invalid_grant). Requires REGULAR or ADMIN role.
- GET /api/auth/google/drive/callback: Called by Google's OAuth redirect. Not role-gated (no Authorization header is present on a browser redirect). Authenticates the user via a short-lived JWT in the state query parameter, exchanges the code for tokens, encrypts the auth blob, and upserts into user_oauth_token.
- GET /api/auth/google/drive/token: Decrypts the stored auth blob, uses the refresh token to fetch a short-lived access token from Google, and returns it to the frontend. Returns no_refresh_token if no record exists, or invalid_grant if Google rejects the refresh token. Requires REGULAR or ADMIN role.
- GET /api/auth/google/config: Exposes clientId and redirectUri to the frontend so the Drive service doesn't need to hardcode them.
Config: Adds google.client-id, google.client-secret, and app-domain to UserSystemConfig and user-system.conf. These must be configured on the Texera GCP project before Drive integration will work.
Any related issues, documentation, discussions?
Closes #4240 (partial — frontend in follow-up PRs)
Google Documentation to enable Google Picker: https://developers.google.com/workspace/drive/picker/guides/overview
How was this PR tested?
- sbt "Auth/testOnly org.apache.texera.auth.TokenEncryptionServiceSpec": 2 unit tests covering encrypt/decrypt round-trip and invalid-input error case
- sbt amber/compile
- /callback endpoint was tested manually via the full OAuth flow in a local dev environment
Was this PR authored or co-authored using generative AI tooling?
Commit messages and some implementation co-authored with Claude Sonnet 4.6
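For readers unfamiliar with the encrypt/decrypt round trip that the unit tests above cover, here is a rough illustration of AES-256-GCM using the JDK's javax.crypto directly. This is a swapped-in technique: the PR's TokenEncryptionService uses jose4j and produces JWE strings, whereas the sketch below emits a plain base64url string of nonce plus ciphertext. The object name AesGcmSketch and its output format are assumptions.

```scala
import java.nio.charset.StandardCharsets
import java.security.SecureRandom
import java.util.Base64
import javax.crypto.Cipher
import javax.crypto.spec.{GCMParameterSpec, SecretKeySpec}

// Illustration only; not the jose4j-based service from the PR.
object AesGcmSketch {
  private val IvBytes = 12  // 96-bit nonce, the size recommended for GCM
  private val TagBits = 128 // authentication tag length
  private val random = new SecureRandom()

  def encrypt(key: Array[Byte], plaintext: String): String = {
    require(key.length == 32, "AES-256 needs a 32-byte key")
    // A fresh random nonce per message; reusing one with the same key breaks GCM.
    val iv = new Array[Byte](IvBytes)
    random.nextBytes(iv)
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(TagBits, iv))
    val ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8))
    // Store the nonce next to the ciphertext so decrypt can recover it.
    Base64.getUrlEncoder().withoutPadding().encodeToString(iv ++ ciphertext)
  }

  // Throws if the input is malformed, tampered with, or encrypted under another key.
  def decrypt(key: Array[Byte], token: String): String = {
    val (iv, ciphertext) = Base64.getUrlDecoder().decode(token).splitAt(IvBytes)
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(TagBits, iv))
    new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8)
  }
}
```

Because GCM is authenticated, decryption with the wrong key or a modified blob fails with an exception instead of returning garbage, which is the kind of invalid-input case a round-trip spec would assert on.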