flightcheck: add WD-ENV-101 Workday ISU username vs Entra UPN format alignment#87
Merged
GrahamMcMynn merged 9 commits intoMay 21, 2026
Conversation
GrahamMcMynn
commented
May 15, 2026
GrahamMcMynn
left a comment
Collaborator
Author
There was a problem hiding this comment.
Code review by ESS Code Review Agent
GrahamMcMynn
added a commit
that referenced
this pull request
May 15, 2026
Addresses code-review feedback on PR #87: the previous control flow required unner.graph to be available before reading the ISU env var, so a tenant where Graph auth had failed but Dataverse worked would silently SKIP the WD-ENV-101 check and miss the BCBSA-style no-@ legacy-format detection which is the most decisive signal this checkpoint produces and needs only the Dataverse value. Reordered: 1. Read ISU env var from Dataverse FIRST. 2. If no @ WARNING (regardless of Graph availability). 3. Only if ISU is in <x>@<domain> shape, require Graph for the verified-domain comparison; SKIP with a partial-result message if Graph isn't available. 4. Existing Graph error/empty-org/empty-verifiedDomains paths unchanged. Added regression test est_warns_on_legacy_isu_format_even_when_no_graph_client that pins this behaviour and renamed the existing est_skipped_when_no_graph_client to est_skipped_when_no_graph_client_and_isu_in_upn_format to clarify it only covers the UPN-shaped path. Both tests now register the Dataverse mock since the ISU read happens before the Graph guard. All 29 flightcheck tests pass; full test suite (113 passed, 8 skipped) still green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GrahamMcMynn
commented
May 15, 2026
…t alignment
Adds a new `_check_isu_username_format` to `checks/workday.py`
(checkpoint WD-ENV-101, Workday category, High priority) that pulls
`EmployeeContextRequestAccountName` from Dataverse and compares its
shape against the tenant's verified Entra domains read from Microsoft
Graph (`/v1.0/organization` -> `verifiedDomains[].name`). Federated
tenants (Okta, Ping, ADFS) frequently leave the ISU username in a
legacy short-employee-id format that doesn't match the UPN claim ESS
sends on each request, which prevents Workday from matching the
incoming request to a Worker - a common misconfiguration on tenants
that provisioned the ISU before adopting UPN-shaped service-account
naming.
Heuristics:
* No `@` in the ISU username -> WARNING (legacy short-id format -
the most-cited misconfiguration root cause).
* `@` present but the domain part isn't in the tenant's verified
domains -> WARNING (could be legitimate cross-tenant federation;
surface for the operator to confirm).
* `<localpart>@<verified-domain>` -> PASSED.
Tests at `tests/flightcheck/checks/test_workday_isu_username_format.py`
cover the GOOD, BAD (no-`@` legacy short-ID scenario),
unverified-domain, empty-verifiedDomains, empty-org, Graph-raises,
and three SKIPPED paths (no Dataverse token / no Graph client / ISU
env var unset). All 10 new tests pass; the existing 18 flightcheck
tests still pass unchanged. Mock tier: dataverse=`documented`,
graph=`validatable` - both permitted FlightCheck tiers per
`tests/fixtures/cassettes/INDEX.md` API tier registry. No new mock
modules were introduced, so no registry update required.
The deeper `Get_Workers User_ID == Entra UPN` per-Worker comparison
suggested in #82 is intentionally deferred:
1. The Workday SOAP mock module (`tests/mocks/workday.py`) is
`MOCK_STATUS = "placeholder"` and would need promotion to
`validated` against `flightcheck_workday.yaml` before any test
could exercise that surface.
2. "A sample of expected ESS users" requires a curated mapping
from Workday Employee_IDs to Entra UPNs that no current
FlightCheck config supplies (the existing `_check_workflows`
only exercises a single `WORKDAY_TEST_EMPLOYEE_ID`).
3. The static format check landed here already addresses the
legacy-ISU root cause (short-employee-id ISU format on a
federated tenant); a future `WD-WF-NNN` can wire the per-Worker
comparison once those inputs are formalised.
Fixes #82
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses code-review feedback on PR #87: the previous control flow required runner.graph to be available before reading the ISU env var, so a tenant where Graph auth had failed but Dataverse worked would silently SKIP the WD-ENV-101 check and miss the no-`@` legacy-format detection - which is the most decisive signal this checkpoint produces and needs only the Dataverse value. Reordered: 1. Read ISU env var from Dataverse FIRST. 2. If no `@` -> WARNING (regardless of Graph availability). 3. Only if ISU is in `<x>@<domain>` shape, require Graph for the verified-domain comparison; SKIP with a partial-result message if Graph isn't available. 4. Existing Graph error / empty-org / empty-verifiedDomains paths unchanged. Added regression test test_warns_on_legacy_isu_format_even_when_no_graph_client that pins this behaviour and renamed the existing test_skipped_when_no_graph_client to test_skipped_when_no_graph_client_and_isu_in_upn_format to clarify it only covers the UPN-shaped path. Both tests now register the Dataverse mock since the ISU read happens before the Graph guard. All 29 flightcheck tests pass; full test suite (113 passed, 8 skipped) still green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two latent bugs in pp_admin_client.py (and supporting wiring in
cli.py) that prevented FlightCheck from producing accurate results
on the majority of tenants. Both bugs predate PR 87, but they gated
PR 87's new WD-ENV-101 check from running end-to-end and were
discovered while validating PR 87 against a real tenant.
1) BAP env id derivation
derive_environment_id() called Dataverse WhoAmI() and returned the
lowercased OrganizationId, treating it as the BAP environment id.
For most tenants the Dataverse OrgId and the BAP env id are
different GUIDs, so every BAP-scoped FlightCheck call (ENV-001/
002/003, EXT-001, flow listing, connection enumeration) hit a 404
from an env id BAP didn't recognise. pva_client._discover_gateway()
already had the correct pattern (list BAP envs, match on
linkedEnvironmentMetadata.instanceUrl); this commit ports that
pattern into pp_admin_client as find_environment_id_by_dataverse_url
and reworks derive_environment_id() to prefer the BAP-backed path
when a PPAdminClient is supplied. cli.py is reordered so env_id
derivation runs after pp_admin.authenticate().
2) Flow listing endpoint host + audience
get_flows() was hitting
api.powerapps.com/providers/Microsoft.ProcessSimple/.../v2/flows,
which returns 404 - that path doesn't exist on the PowerApps host.
The admin flow-listing endpoint lives on api.flow.microsoft.com
AND requires a different audience token
(service.flow.microsoft.com//.default). PowerApps-audience tokens
return 401 against api.flow.microsoft.com.
This commit:
* adds FLOW_BASE and FLOW_SCOPE constants
* rewrites authenticate() to acquire both PowerApps and Flow
audience tokens via the shared MSAL cache
* adds a flow_headers property and a use_flow_token=True kwarg
on _get / _get_all
* rewires get_flows() and get_flow() to use FLOW_BASE + the Flow
audience token
Validated end-to-end against a real tenant:
* Before: env_id = Dataverse OrgId; ENV-001/002/003 = SKIPPED;
EXT-001 = WARNING (404); Workday block silently gated out.
* After: env_id = BAP id (matched via instanceUrl);
ENV-001/002/003 = PASSED; EXT-001 reports 5 ServiceNow + 6 SAP
flows; Workday block runs (WD-001 = NotConfigured on this env,
which has no Workday flows - as expected).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PRE-001/002/003 in checks/prerequisites.py used case-sensitive `in`
substring checks against UPPER_CASE pattern strings (e.g. checking
for "MICROSOFT_365_COPILOT" inside skuPartNumber). Microsoft Graph
/v1.0/subscribedSkus returns skuPartNumber values in mixed case
(e.g. "Microsoft_365_Copilot"), so the matchers reported FAILED for
tenants that actually had the required licenses.
PRE-003 additionally only checked for the substring "TEAMS", which
missed bundle SKUs that include Teams without "TEAMS" in their name
(O365_BUSINESS_PREMIUM, ENTERPRISEPACK, SPE_E3/E5, etc.).
This commit:
* Adds a _sku_matches(sku, patterns) helper that lowercases both
sides before checking for a substring match.
* Adds module-level tuples covering the real SKU vocabulary:
- M365_COPILOT_SKUS - Microsoft 365 Copilot standalone + bundles
- COPILOT_STUDIO_SKUS - Studio standalone, plus M365 Copilot
bundle (which now grants Studio entitlements)
- TEAMS_BEARING_SKUS - explicit list of SKUs that include Teams
(E3/E5/F1/F3 family, Business Premium, M365 Apps, etc.)
* Updates PRE-001/002/003 to use the helper with the new tuples.
Bug predates PR 87 (introduced in commit 6d38f06, May 7) but
discovered while validating PR 87. Validated end-to-end against
a real tenant: PRE-001/002/003 now PASS where they previously
FAILED on a tenant with 50 M365 Copilot, 51 O365 Business Premium,
and other valid licenses.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Follow-up to the pp_admin_client.py fix. The earlier "bug 4" regression test in tests/flightcheck/test_pp_admin_client.py pinned a 404 from GET https://api.powerapps.com/.../v2/flows as expected behavior for "Dataverse-only environments." That was a misdiagnosis - the flow listing endpoint actually lives on api.flow.microsoft.com with its own audience token, so the 404 was a wrong-URL artefact, not an env-shape signal. This commit cleans up the test infrastructure to reflect the corrected endpoint: * tests/mocks/pp_admin.py - Add FLOW_BASE = "https://api.flow.microsoft.com" constant. - Update list_flows() and insufficient_permissions(endpoint="flows") to use FLOW_BASE. - Remove flows_resource_not_found() - the 404 it returned was from the wrong host, not a real "Dataverse-only env" response. * tests/flightcheck/test_pp_admin_client.py - Remove TestGetAll404Crashes::test_get_flows_raises_http_error_on_404. The bug it pinned (URL was wrong) is fixed at the source; the test is no longer meaningful. - Update the pp_client fixture to set both _token and _flow_token so tests that exercise get_flows() / get_flow() do not trip the "Call authenticate() first" guard on flow_headers. - Keep test_get_connections_handles_403_gracefully_returns_empty_list - still pinning real behavior on the connections endpoint. - Update the module docstring with a historical note explaining the removed test. * tests/fixtures/cassettes/INDEX.md - Split the "PowerApps Admin API (.../v2/flows)" row into two: a PowerApps row for connections (unchanged) and a Power Automate row for flows (host correction noted, cassette to be re-captured). - Remove the "intentional 404" explanation paragraph that rationalised the misdiagnosis. All 28 flightcheck tests pass; full test suite 113 passed 8 skipped. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses code-review feedback on PR #87: remove all references to a specific customer name from the production check and its tests. The technical context (federated tenant, pre-UPN ISU provisioning) is preserved; only the customer identifier is removed. Replaces four sites: * checks/workday.py:310 - inline comment describing why the no-`@` path is reported off the Dataverse value alone. * test_workday_isu_username_format.py:18 - module docstring enumeration of the BAD path. * test_workday_isu_username_format.py:246 - docstring on test_legacy_short_id_format_warns. * test_workday_isu_username_format.py:419 - docstring on test_warns_on_legacy_isu_format_even_when_no_graph_client. All 11 WD-ENV-101 tests pass; no behaviour change. Commit-message references to the same customer name in commits 891537e and be61b5d were scrubbed by rewording those commits as de25052 and dfc9ee4 respectively in this branch's force-push. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…me-alignment-with-entra
9036bd7 to
1939c85
Compare
Reviewer (srideshpande) asked why we still call derive_environment_id
when pp_admin is None, and whether we should error out at that point.
derive_environment_id intentionally tolerates pp_admin=None and falls
back to the Dataverse WhoAmI/OrganizationId path so that operators
whose Power Platform sign-in failed (network issue, cancelled browser,
MSAL error - NOT "user lacks PP Admin role"; the role check happens
later per-API via 401/403) can still run the substantial fraction of
FlightCheck that doesn't need pp_admin:
- PRE-* (M365 Copilot / Copilot Studio / Teams license SKU checks)
- AUTH-* (Graph health)
- WD-ENV-* (Workday env var / ISU format - includes WD-ENV-101)
- WD-WF-* (Workday SOAP runtime tests against the live tenant)
- CONFIG-* (local agent / topic / variable / knowledge source)
That is roughly half of FlightCheck, and it's often the first stuff
the operator needs to fix anyway. Erroring out at env_id derivation
would block all of it behind a Power Platform sign-in problem the
operator may not even have a way to fix immediately.
The reviewer's underlying concern that the message is too vague is
valid, though. This commit replaces the single
"WARNING: Could not derive environment ID. Some checks may be limited."
with three distinct outcomes so the operator knows what happened and
which checks to expect:
1. env_id resolved via BAP (pp_admin OK and lookup matched):
"Environment ID: <id>"
2. env_id resolved via WhoAmI fallback (pp_admin sign-in failed):
"Environment ID: <id> (Dataverse OrganizationId fallback - Power
Platform sign-in failed, so BAP-scoped checks (ENV-*, EXT-*,
WD-CONN-*) will be skipped)"
3. Both paths failed:
"WARNING: Could not derive environment ID for <env_url>.
BAP-scoped checks (ENV-*, EXT-*, WD-CONN-*) will be skipped;
license, auth, Workday env-var, Workday SOAP, and local-file
checks will still run."
Also expanded the inline comment above the call to document the design
intent at the call site so future readers don't ask the same question.
Validation: 113 full-suite tests + 8 expected skips pass unchanged.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
srideshpande
approved these changes
May 21, 2026
…me-alignment-with-entra
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Implements the FlightCheck check requested in #82.
Checkpoint:
WD-ENV-101Category: Workday
Priority: HIGH
Conditional? No (runs whenever Workday flows are detected)
API tier:
documented(existingtests/mocks/dataverse.py, MS Learnenvironmentvariabledefinition/environmentvariablevaluereference)validatable(existingtests/mocks/graph.py, CSDLEntityType Name="organization"+https://learn.microsoft.com/graph/api/organization-get)Mock source: Reuses existing builders only — no new mock module, no registry update.
What this check does
Pulls
EmployeeContextRequestAccountNamefrom Dataverse and compares its shape against the tenant's verified Entra domains read from Microsoft Graph (runner.graph.get_organization()→verifiedDomains[].name). Federated tenants (Okta, Ping, ADFS) frequently leave the ISU username in a legacy short-employee-id format that doesn't match the UPN claim ESS sends on each request, which prevents Workday from matching the incoming request to a Worker — a common misconfiguration on tenants that provisioned the ISU before adopting UPN-shaped service-account naming.Heuristics:
@in the ISU username → Warning (legacy short-id format — the most-cited misconfiguration root cause).@present but the domain part isn't in the tenant's verified domains → Warning (could be legitimate cross-tenant federation; surface for the operator to confirm).<localpart>@<verified-domain>→ Passed.Tests
tests/flightcheck/checks/test_workday_isu_username_format.pycovers GOOD (UPN-shaped + verified-domain match, plus the initial*.onmicrosoft.comdomain), BAD (no-@legacy short-ID scenario, unverified-domain), defensive (empty verifiedDomains, empty org record, Graph raises), and SKIPPED paths (no Dataverse token, no Graph client, ISU env var unset — defers to WD-ENV-001).11 new tests pass; the existing 18 flightcheck tests still pass unchanged. Full repo suite: 113 passed, 8 skipped.
Verification context
The source issue did not include a
## Verificationsection, so the implementation pattern was inferred from the most-similar existing check:_check_env_varsin the same module (checks/workday.py) for the Dataverse half, andchecks/prerequisites.pyfor therunner.graphusage pattern. Test file mirrorstests/flightcheck/checks/test_workday_env_vars.py.End-to-end validated against a real tenant: ran flightcheck against a Workday-equipped environment with the Dataverse env var both UPN-shaped (PASSED) and deliberately set to a legacy short-id value (WARNED with the expected remediation).
What was deferred (and why)
The deeper
Get_Workers User_ID == Entra UPNper-Worker comparison from the suggested implementation is intentionally deferred to a follow-up:tests/mocks/workday.py) isMOCK_STATUS = "placeholder"and would need promotion tovalidatedagainstflightcheck_workday.yamlbefore any test could exercise that surface (per the conftest enforcement)._check_workflowsonly exercises a singleWORKDAY_TEST_EMPLOYEE_ID).WD-WF-NNNcan wire the per-Worker comparison once the Workday mock is promoted and the sample-list input is formalised.Additional fixes piggy-backed on this PR
While validating WD-ENV-101 end-to-end against a real tenant, three latent bugs in the original May 7 Python-scripts commit surfaced — each one blocked the new check (or other checks) from running on most tenants. They are fixed in this PR because the new WD-ENV-101 check could not be validated without them.
pp_admin_client.py+cli.py) —derive_environment_id()returned the Dataverse OrgId rather than the BAP env id (different GUIDs on most tenants), so every BAP-scoped check (ENV-001/002/003, EXT-001, flow listing, connection enumeration) 404'd. Now uses an instance-URL match against the BAP env list, mirroringpva_client._discover_gateway().pp_admin_client.py) —get_flows()was hittingapi.powerapps.com/.../v2/flows(404, path doesn't exist there) instead ofapi.flow.microsoft.com/.../v2/flowswith the requiredservice.flow.microsoft.com//.defaultaudience token. AddsFLOW_BASE,FLOW_SCOPE, dual-tokenauthenticate(), anduse_flow_token=Trueplumbing.checks/prerequisites.py) — PRE-001/002/003 used case-sensitiveinchecks against UPPER_CASE patterns while Graph returns mixed-case values (Microsoft_365_Copilot); also missed Teams bundles (O365_BUSINESS_PREMIUM,ENTERPRISEPACK, etc.). Added_sku_matches()helper + reference-list-grounded SKU tuples.Test-side follow-up (
tests/mocks/pp_admin.py,tests/flightcheck/test_pp_admin_client.py,tests/fixtures/cassettes/INDEX.md): aligned with the corrected Power Automate endpoint. The previous "bug 4" regression test pinning a 404 fromapi.powerapps.comas "expected for Dataverse-only environments" was a misdiagnosis — the 404 was a wrong-URL artefact — and has been removed.A companion PR (#95) documents the corresponding guidance updates to
flightcheck/AGENTS.mdso future agents don't reintroduce the same classes of bug.Fixes #82