How an attacker could poison a company's AI model by claiming a storage bucket before the SDK checked who owned it
Your company's AI model upload tool found a storage bucket with the right name and assumed it belonged to you. It didn't. An attacker had claimed it first, and the tool never checked ownership before handing over your model files.
That single missing check opened a four-step chain: the attacker predicted your storage bucket name from your public project ID, registered that bucket in their own cloud account, waited for your model to land there, and swapped it for a poisoned version roughly 1.4 seconds after the upload finished. When your AI platform loaded the replacement, it executed the attacker's code inside Google's own managed infrastructure, under a Google-managed service account with broad access to your cloud environment.
The detail that should keep security leaders awake: the attacker needed no password, no phishing email, and no foothold inside your environment. They needed only a Google Cloud account of their own and your project ID, which is routinely embedded in public GitHub repositories and client-side JavaScript. The attack was reported to Google on March 5, 2026, and publicly disclosed by Palo Alto Networks Unit 42 on June 16, 2026, under the name "Pickle in the Middle."
Narrative · 6 min read
The Context
Vertex AI is Google Cloud's managed platform for building, training, and deploying machine learning models. Organizations use it to take a trained model (a file containing the mathematical weights that make an AI system work) and serve it as a live API endpoint that their applications can call. To move a model from a developer's machine into Vertex AI's serving infrastructure, the platform's Python SDK uses Google Cloud Storage (GCS) as an intermediate staging area: the SDK uploads the model file to a storage bucket, and Vertex AI's backend then reads it from there to load it into a container.
The vulnerability lived in that handoff step, specifically in how the SDK chose which storage bucket to use when the developer did not specify one.
The Attack, Phase by Phase
Phase 1: Reconnaissance and Bucket Pre-Registration
The attacker's first task was to predict the name of the storage bucket the victim's SDK would try to use. The SDK constructed that name using a deterministic formula based on the victim's GCP project ID and region, producing names like project-vertex-staging-region. GCP project IDs are frequently public: embedded in GitHub configuration files, client-side JavaScript, or API responses. Once the attacker had the project ID, they could calculate the expected bucket name, then create a bucket with that exact name inside their own GCP account and configure it to accept uploads from the victim's SDK.
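The name derivation described above can be sketched in a few lines. This is a minimal illustration of the pattern as this article describes it (project ID, then `vertex-staging`, then region), not the SDK's actual code; the function name and the example project ID are hypothetical.

```python
def default_staging_bucket(project_id: str, region: str) -> str:
    """Build a default staging bucket name from the pattern quoted in this article.

    Hypothetical helper for illustration; it is not the SDK's internal function.
    """
    return f"{project_id}-vertex-staging-{region}"


# Hypothetical victim values. The project ID is the kind of string that leaks
# through public repositories or client-side JavaScript.
candidate = default_staging_bucket("acme-ml-prod", "us-central1")
print(candidate)
```

Nothing in the computation is secret: both inputs are either public or drawn from a short list of regions, which is why the name offers no protection on its own.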
Phase 2: Artifact Interception via Silent Upload Hijack
When a developer at the victim organization called Model.upload() without specifying a custom staging bucket, the SDK constructed the default bucket name, checked whether a bucket with that name existed globally, found one (the attacker's pre-registered bucket), and uploaded the model files there. The SDK in affected versions 1.139.0 through 1.147.x performed no ownership check. From the developer's perspective, the upload completed successfully. No error, no warning.
Phase 3: Race-Window Model Swap
The attacker had configured a Cloud Function (a small piece of code that runs automatically in response to an event) to fire the moment any file landed in their bucket. Google Cloud Storage sends an object-finalize event the instant an upload completes. The attacker's function received that event and replaced the legitimate model file with a malicious one in approximately 1.4 seconds. Vertex AI's backend needed roughly 2.5 seconds after the upload before it read the file to load it into a serving container. The attacker's swap happened well inside that window.
Phase 4: Serving-Time Code Execution and Post-Exploitation Pivot
When the victim deployed the model to a Vertex AI endpoint, Google's internal P4SA (a per-product, per-project service account that Google manages on the customer's behalf) loaded the poisoned model file using joblib.load(). Because the file was a malicious pickle payload, Python executed the attacker's embedded instructions at load time. The payload reached out to the serving container's internal metadata server and retrieved the P4SA's OAuth token, a credential with broad access to the victim's cloud project.
With that token, the researchers demonstrated three post-exploitation capabilities: reading trained model weights from other deployments in the same project (proprietary intellectual property), enumerating all BigQuery datasets, table names, and access control lists (sensitive data inventory), and extracting internal infrastructure details from Cloud Logging, including internal Kubernetes cluster names, container image paths, and Dockerfiles.
What Made This Possible
- No ownership check on an assumed resource. The SDK assumed that a bucket with the right name was the right bucket. In a namespace that is globally shared across all Google Cloud customers, that assumption is an open invitation to pre-registration attacks.
- Predictable naming from public inputs. The bucket name formula used the project ID and region, both of which are routinely public. A naming convention that an attacker can calculate from a GitHub search is not a secret.
- Pickle deserialization as a code execution primitive. The ML ecosystem's standard serialization format executes arbitrary code at load time. Any point in a model pipeline where an untrusted file can be substituted becomes a remote code execution opportunity. The P4SA's broad permissions then amplified a single file swap into full tenant reconnaissance.
This was the second predictable-bucket-name flaw to surface in Vertex AI in 2026. A separate flaw in the Vertex AI Experiments workflow (CVE-2026-2473) was patched in February 2026, just two weeks before this report was filed. Focal Security independently found the same root class of flaw in two other GCP products (Gemini Enterprise and Cloud Run) in the same period. The pattern is systemic, not accidental.
What Should Have Stopped This
Every defense that would have reduced the blast radius shares one trait: it does not depend on the storage bucket's name being correct. It verifies something the attacker cannot fake.
- Bucket ownership verification before use. This is exactly what Google's permanent fix in SDK v1.148.0 added: the SDK now confirms it owns the bucket before writing to it. Any SDK that auto-creates or auto-selects a shared resource should verify ownership, not just existence.
- Explicit staging bucket configuration. Developers who specified a custom staging_bucket parameter were never exposed. Requiring an explicit value, rather than allowing a default, forces a deliberate choice and eliminates the predictable-name attack surface.
- Least-privilege service accounts. The P4SA's broad cloud-platform scope turned a single file swap into access to model weights, BigQuery metadata, and internal infrastructure. A service account scoped only to the specific bucket and endpoint it needs would have contained the blast radius to that one deployment.
- Model artifact integrity verification. Treating model files as executable code (which pickle makes them) means they should carry cryptographic signatures, verified before loading. If the serving infrastructure had checked a signature, the swapped file would have been rejected before joblib.load() was ever called.
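The integrity-verification defense can be sketched as a check that runs before any deserializer touches the bytes. This is a minimal sketch, assuming an HMAC-SHA256 tag as a stand-in for a real signature scheme and assuming the tag travels out of band (not through the staging bucket). The function names and key handling are hypothetical and do not describe how Vertex AI works.

```python
import hashlib
import hmac


def sign_artifact(data: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the artifact bytes (stand-in for a signature)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_before_load(data: bytes, expected_tag: str, key: bytes) -> bytes:
    """Return the bytes only if the tag matches; raise before any unpickling happens."""
    actual_tag = sign_artifact(data, key)
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual_tag, expected_tag):
        raise ValueError("model artifact failed integrity check; refusing to load")
    return data


# Hypothetical flow: the developer tags the file at upload time...
key = b"example-key-for-illustration-only"
original = b"legitimate model bytes"
tag = sign_artifact(original, key)

# ...and the loader rejects anything that was swapped afterwards.
swapped = b"attacker-controlled bytes"
try:
    verify_before_load(swapped, tag, key)
    swap_rejected = False
except ValueError:
    swap_rejected = True
print(swap_rejected)
```

The important property is ordering: verification happens on raw bytes, so a failed check means the pickle stream is never interpreted.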
The Takeaway
This attack is the same class of failure as the Stryker Intune wipe: a trusted management tool was weaponized against the organization it was built to serve, because the tool assumed that a resource with the right name was a safe resource. In the Stryker case, the tool was a device management platform. Here, it was an AI model upload SDK. The failure class is identical: trust granted by name, not by verified ownership.
The attack also connects to the Axios supply chain incident: in both cases, a build-time or pipeline-time artifact substitution produced code execution at the moment a trusted system consumed the artifact. The meta-pattern across all three: systems fail when they trust a boundary the attacker controls, whether that boundary is a device name, a package name, or a storage bucket name.
Pattern to remember: When a developer tool auto-generates a resource name from predictable inputs and uses that resource without verifying ownership, anyone who can predict the name can own the resource first.
What changed: AI model files are now a code execution surface in their own right. Any pipeline step that loads a model artifact from a location an attacker could influence is functionally equivalent to running untrusted code, and the blast radius is determined entirely by the permissions of the service account that does the loading.
Technical Deep Dive · 3 min
The Technical Mechanism
The vulnerability lives in the Model.upload() method of the google-cloud-aiplatform Python SDK, specifically in the staging bucket resolution logic inside gcs_utils.py. When staging_bucket is not provided, the SDK calls an internal helper that constructs a default bucket name using the string template {project_id}-vertex-staging-{region}. The helper then calls the GCS buckets.get API to test for existence. In affected versions (1.139.0 through 1.147.x), a 200 response from buckets.get was treated as sufficient authorization to write to the bucket. The SDK did not call buckets.getIamPolicy or any ownership-asserting API to confirm the authenticated principal had created or owned the bucket.
Because GCS bucket names are globally unique across all tenants, an attacker who pre-registers the expected name in their own project receives a 200 from buckets.get when the victim's SDK queries it. The SDK proceeds to upload model artifacts (typically model.joblib or TensorFlow SavedModel directories) to the attacker-controlled location.
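The effect of an existence-only check in a globally shared namespace can be reproduced with a toy model. This is a sketch under stated assumptions: `GlobalNamespace`, `Bucket`, and `resolve_staging_bucket_existence_only` are hypothetical stand-ins written for this example, not GCS or SDK code, and the project names are invented.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Bucket:
    name: str
    owner_project: str


class GlobalNamespace:
    """Toy model of a namespace where each bucket name has exactly one owner."""

    def __init__(self) -> None:
        self._buckets: Dict[str, Bucket] = {}

    def create(self, name: str, owner_project: str) -> Bucket:
        if name in self._buckets:
            raise ValueError(f"bucket name already taken: {name}")
        bucket = Bucket(name, owner_project)
        self._buckets[name] = bucket
        return bucket

    def get(self, name: str) -> Optional[Bucket]:
        return self._buckets.get(name)


def resolve_staging_bucket_existence_only(
    ns: GlobalNamespace, project_id: str, region: str
) -> Bucket:
    """Mimic the flawed logic described above: 'exists' is treated as 'mine'."""
    name = f"{project_id}-vertex-staging-{region}"
    existing = ns.get(name)
    if existing is not None:
        return existing  # no ownership check
    return ns.create(name, owner_project=project_id)


ns = GlobalNamespace()
# The attacker registers the predictable name first, from their own project.
ns.create("acme-ml-prod-vertex-staging-us-central1", owner_project="attacker-project")
chosen = resolve_staging_bucket_existence_only(ns, "acme-ml-prod", "us-central1")
print(chosen.owner_project)
```

In this model the victim's resolution step returns a bucket owned by `attacker-project`, and nothing in the return value signals that anything went wrong.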
The attacker deploys a Cloud Function subscribed to the google.storage.object.finalize event on their bucket. Unit 42's proof of concept showed the function completing a file replacement in approximately 1.4 seconds. Vertex AI's internal P4SA reads the artifact within roughly 2.5 seconds of upload completion, creating a reliable race window.
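The race reduces to simple arithmetic on the two reported timings. The sketch below models it as a timeline, using the approximate figures quoted in this article; the function and constant names are hypothetical, and real latencies would vary from run to run.

```python
SWAP_LATENCY_S = 1.4        # approximate time for the attacker's function to replace the file
BACKEND_READ_DELAY_S = 2.5  # approximate delay before the backend reads the artifact


def version_read_by_backend(swap_latency_s: float, read_delay_s: float) -> str:
    """Return which version of the file is in place when the backend reads it."""
    # Each entry is (time the write completes, version now stored), measured
    # from the moment the victim's upload is finalized.
    timeline = [(0.0, "legitimate"), (swap_latency_s, "malicious")]
    current = "legitimate"
    for completed_at, version in sorted(timeline):
        # Only writes that finish strictly before the read are visible to it.
        if completed_at < read_delay_s:
            current = version
    return current


margin_s = round(BACKEND_READ_DELAY_S - SWAP_LATENCY_S, 1)
print(version_read_by_backend(SWAP_LATENCY_S, BACKEND_READ_DELAY_S), margin_s)
```

With these numbers the attacker has roughly 1.1 seconds of slack, which is why the window is described as reliable rather than lucky.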
The malicious payload abuses the __reduce__ hook of Python's pickle module. When the attacker serializes the payload object, its __reduce__ method returns a callable and a tuple of arguments, and pickle records that pair in the file. When joblib.load() deserializes the file, the unpickler invokes the recorded callable with those arguments unconditionally, before any application code can inspect the result. The proof-of-concept payload issued an HTTP GET to the GCE metadata server at http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token with the Metadata-Flavor: Google header, retrieving the P4SA's OAuth 2.0 access token. That token carried the https://www.googleapis.com/auth/cloud-platform scope, enabling the demonstrated post-exploitation actions.
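The load-time execution behavior is easy to observe with a harmless payload. The sketch below uses only the standard library's pickle module, which is the format this article says joblib.load() was deserializing. The callable here is `operator.add`, chosen because it is benign; the class name is hypothetical.

```python
import operator
import pickle


class HarmlessPayload:
    """Object whose pickled form is just 'call operator.add(40, 2) on load'."""

    def __reduce__(self):
        # Runs when the object is pickled. The (callable, args) pair it returns
        # is written to the stream, and the unpickler calls it during load.
        return (operator.add, (40, 2))


blob = pickle.dumps(HarmlessPayload())

# Loading does not rebuild a HarmlessPayload; it runs the recorded call
# and hands back whatever that call returned.
result = pickle.loads(blob)
print(result, type(result).__name__)
```

An attacker substitutes a callable with side effects in place of `operator.add`. The loader has no chance to object, because the call happens inside the deserialization step itself.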
The partial fix in SDK v1.144.0 (March 31, 2026) appended a uuid4 suffix to the default bucket name, breaking the deterministic naming formula. The permanent fix in v1.148.0 (April 15, 2026) added an explicit ownership check: before using any existing bucket, gcs_utils.py now verifies that the authenticated principal is listed as an owner in the bucket's IAM policy.
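The shape of an ownership check against an IAM policy can be sketched as a pure function. This is an illustration under stated assumptions, not the logic in gcs_utils.py: the set of roles treated as "owner", the exact-string membership test, and all names below are hypothetical simplifications. A real check would also need to resolve group and project-level memberships.

```python
from typing import Any, FrozenSet, Mapping, Optional

# Assumption for this sketch: which roles count as ownership.
OWNER_ROLES: FrozenSet[str] = frozenset(
    {"roles/storage.legacyBucketOwner", "roles/storage.admin"}
)


def principal_owns_bucket(
    policy: Mapping[str, Any],
    principal: str,
    owner_roles: FrozenSet[str] = OWNER_ROLES,
) -> bool:
    """Return True if the principal appears in a binding for an owner role."""
    for binding in policy.get("bindings", []):
        if binding.get("role") in owner_roles and principal in binding.get("members", []):
            return True
    return False


def decide(policy: Optional[Mapping[str, Any]], principal: str) -> str:
    """Decide what to do with a default-named bucket: create, use, or refuse."""
    if policy is None:
        return "create"  # name is free, so the caller can create its own bucket
    if principal_owns_bucket(policy, principal):
        return "use"
    return "refuse"  # the bucket exists but belongs to someone else


developer = "serviceAccount:uploader@acme-ml-prod.iam.gserviceaccount.com"
squatted_policy = {
    "bindings": [
        {"role": "roles/storage.legacyBucketOwner", "members": ["projectOwner:attacker-project"]},
        {"role": "roles/storage.objectCreator", "members": ["allUsers"]},
    ]
}
print(decide(squatted_policy, developer))
```

The design point is the third outcome: a pre-registered bucket that grants the victim write access but not ownership leads to a refusal rather than a silent upload.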
The same root flaw (predictable bucket name, no ownership verification) was independently confirmed in the Vertex AI Experiments workflow as CVE-2026-2473, patched in SDK v1.133.0 in February 2026, and in two additional GCP products (Gemini Enterprise as CVE-2026-1727, and Cloud Run as "MountSquat") reported by Focal Security.
CVE and Advisories
No CVE has been assigned to the "Pickle in the Middle" vulnerability (the Model.upload() path) as of the June 16, 2026 public disclosure date. Neither Unit 42 nor Google's Vertex AI security bulletins list a CVE identifier for this specific issue.
The predecessor vulnerability in the Vertex AI Experiments workflow is tracked as CVE-2026-2473 and documented in Google Cloud's Vertex AI Security Bulletins (GCP-2026-012), published February 20, 2026.
The related Gemini Enterprise bucket-squatting flaw is tracked as CVE-2026-1727, also documented in Google Cloud's security bulletins.
MITRE ATT&CK Mapping
| Technique ID | ATT&CK name | How it appeared |
|---|---|---|
| T1195 | Supply Chain Compromise | The attacker intercepted model artifacts in transit between the developer's environment and the AI serving platform by pre-registering the expected storage location. |
| T1525 | Implant Internal Image | The malicious pickle payload was substituted for a legitimate model artifact that the platform's internal service account then loaded and executed. |
| T1552.005 | Unsecured Credentials: Cloud Instance Metadata API | The payload retrieved OAuth credentials from the container's metadata server, exploiting the implicit credential availability in managed cloud execution environments. |
| T1078.004 | Valid Accounts: Cloud Accounts | The stolen P4SA OAuth token was used to authenticate subsequent API calls for model theft, BigQuery enumeration, and Cloud Logging reconnaissance. |
| T1530 | Data from Cloud Storage Object | Post-exploitation access included reading model artifacts (trained weights) from other deployments in the same tenant project. |
| T1619 | Cloud Storage Object Discovery | The attacker used the compromised P4SA token to enumerate BigQuery datasets, table schemas, and access control lists across the victim's project. |
Indicators of Compromise
Unit 42 and Google confirmed no exploitation in the wild was observed prior to public disclosure on June 16, 2026. No network indicators or file hashes from active campaigns have been published.
Detection is structurally difficult because the attack produces no anomalous authentication events on the victim's side: the SDK's upload to the attacker's bucket uses the victim's own credentials writing to what appears to be a valid GCS endpoint. The model swap occurs entirely within the attacker's GCP project.
Detection Approaches
- Audit GCS bucket ownership for any bucket matching the pattern {project_id}-vertex-staging-{region}. If a bucket with that name exists but is not owned by your project, your SDK may have uploaded to it.
- Review Cloud Logging for joblib.load() calls in Vertex AI serving containers followed by unexpected outbound HTTP requests to external endpoints or the metadata server from within model-serving processes.
- Monitor P4SA token usage for API calls outside the expected Vertex AI serving scope, particularly bigquery.datasets.list, logging.entries.list, or storage.objects.get calls on buckets outside the model registry.
- SDK versions below 1.148.0 should be treated as indicators of potential exposure in any environment where Model.upload() was called without an explicit staging_bucket parameter.
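The last check in the list above lends itself to a small triage helper. This is a minimal sketch that encodes the rule as stated in this article (SDK older than 1.148.0, no explicit staging bucket); the function names are hypothetical, and the version parser deliberately handles only plain `major.minor.patch` prefixes.

```python
import re
from typing import Optional, Tuple

FIXED_VERSION: Tuple[int, int, int] = (1, 148, 0)  # permanent fix, per this article


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse the leading 'major.minor.patch' of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def potentially_exposed(sdk_version: str, staging_bucket: Optional[str]) -> bool:
    """Flag environments that used a pre-fix SDK without an explicit staging bucket."""
    return parse_version(sdk_version) < FIXED_VERSION and not staging_bucket


# Hypothetical inventory rows: (installed SDK version, configured staging bucket).
inventory = [("1.147.2", None), ("1.148.0", None), ("1.140.0", "gs://acme-ml-staging")]
flags = [potentially_exposed(version, bucket) for version, bucket in inventory]
print(flags)
```

In a real environment the installed version could be read with `importlib.metadata.version("google-cloud-aiplatform")`. A flagged row means "investigate the bucket's ownership", not "confirmed compromise".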
Attribution
This is a vulnerability research disclosure, not a tracked threat actor campaign. The vulnerability was discovered by Palo Alto Networks Unit 42 and reported to Google through the Vulnerability Reward Program on March 5, 2026. Unit 42 stated no exploitation in the wild was observed. No nation-state or criminal group has been linked to active use of this technique.
Primary Sources
- 01. Pickle in the Middle: Hijacking Vertex AI Model Uploads for Cross-Tenant RCE
  Palo Alto Networks Unit 42 · June 16, 2026
- 02. Google Vertex AI SDK Flaw Let Attackers Hijack Model Uploads via Bucket Squatting
  The Hacker News · June 16, 2026
- 03. Vertex AI Security Bulletins (GCP-2026-012, CVE-2026-2473)
  Google Cloud · February 20, 2026
- 04. Kicking the Bucket: Critical RCE and Cross-Tenant Exploits in 3 Different GCP Products
  Focal Security · February 2026
- 05. Google Cloud Vertex AI Vulnerability Lets Attackers Take Over and Poison AI Models
  GBHackers · June 17, 2026
- 06. Google's Vertex AI SDK could allow RCE through bucket squatting
  CSO Online · June 17, 2026