Docker 29 Broke Our AWS ECR Deploys: The Missing Permission Nobody Told You About
This morning, we did two deployments 15 minutes apart. The first one passed. The second one failed with a cryptic 403 Forbidden error during docker push to Amazon ECR. Nothing in our code had changed. Here's what happened and how we fixed it.
UPDATE - February 13, 2026
An AWS engineer reached out with the actual root cause: the 403 Forbidden occurs when the caller lacks the ecr:BatchGetImage permission. Docker 29's containerd image store makes HEAD requests that require this permission, which older Docker versions didn't need.
To debug ECR permission failures: Check AWS CloudTrail for the specific permission denial — the failure codes there will tell you exactly which permission is missing.
The fix: Add ecr:BatchGetImage to your IAM policy. If you can't modify IAM permissions quickly (or want a workaround that doesn't require IAM changes), the crane-based solution below still works by bypassing the problematic manifest check entirely.
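As a sketch, an ECR push policy that includes the missing permission might look like the following. This is an illustration, not AWS's recommended policy: the exact action set your pipeline needs may differ, ecr:GetAuthorizationToken must be granted on Resource "*", and in practice you should scope the remaining actions to your repository's ARN rather than "*".

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EcrPushWithDocker29",
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage",
        "ecr:BatchGetImage"
      ],
      "Resource": "*"
    }
  ]
}
```

The last action, ecr:BatchGetImage, is the one Docker 29's manifest HEAD check requires; the others are the standard push permissions most ECR policies already grant.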
Why this post exists: CI/CD breaks between deploys with a cryptic 403, and you start pulling every thread. Docker version changed, manifest format changed, and apparently permissions too. A lot of moving pieces landing at once with no clear signal pointing to IAM.
GitHub announced the runner update, but it was buried in release notes — no banner in Actions, no heads-up where developers actually work. Then you've got this whole chain to debug: GitHub Actions → Docker → Laravel Vapor → ECR → IAM, where each layer can mask where the actual problem lives. Modern infrastructure has multiple choke points and failure modes that look identical from the outside.
For context: Laravel Vapor is a serverless deployment platform that sits on top of AWS Lambda and deploys Docker images to ECR. As of this writing, even Vapor's recommended IAM policy doesn't include the new required permissions yet.
Even opening an AWS support case didn't yield quick results. The goal was to get unblocked and move on, and the crane workaround accomplished that. Now we know the cleaner fix — add the IAM permissions.
The Symptom
Our GitHub Actions workflow builds a Docker image and pushes it to Amazon ECR. Overnight, the push step started failing:
unexpected status from HEAD request to
https://ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/v2/REPO/manifests/IMAGE_TAG:
403 Forbidden
Every layer pushed successfully. The failure only happened at the final manifest step — a HEAD request to check if the manifest already existed returned 403 instead of the expected 404 (not found) or 200 (exists).
The Root Cause
GitHub Actions recently updated their ubuntu-latest runner image, upgrading Docker from 28.0.4 to 29.1.5 (actions/runner-images#13474).
Docker 29 made a significant change: containerd is now the default image store, which means images are stored and pushed using the OCI (Open Container Initiative) manifest format instead of the legacy Docker v2 Schema 2 format. That change is real, but it is not why the push fails.
The problem? Docker 29's containerd makes a HEAD request to check if the manifest exists before pushing. This request requires ecr:BatchGetImage permission — a permission that older Docker versions didn't need for the push operation. Without it, ECR returns 403 Forbidden.
The simplest fix: Add ecr:BatchGetImage to your IAM role/policy. See the update above for details.
This is tracked in moby/moby#51532 and discussed in actions/runner-images#13474.
What We Tried (Before We Knew It Was Permissions)
We initially assumed this was a manifest format issue. These attempts all failed because we were solving the wrong problem. Documenting them here so you don't waste time on the same dead ends.
Attempt 1: Pin Docker Client to v27
We downloaded the Docker 27 static binary and placed it ahead of Docker 29 in $PATH:
- name: Install Docker 27 client
run: |
curl -sL https://download.docker.com/linux/static/stable/x86_64/docker-27.5.1.tgz | tar xz
cp docker/docker /usr/local/bin/docker
Result: Failed. The Docker client was v27, but the daemon was still v29 with containerd. The daemon controls the push, not the client. You can't fix a server-side behavior by swapping the client binary.
Attempt 2: --platform linux/amd64 Flag
Docker 29 introduced a --platform flag for docker push that forces a single-platform push instead of a multi-platform manifest index:
- name: Wrap docker push
run: |
REAL_DOCKER=$(which docker)
cat > /usr/local/bin/docker <<WRAPPER
#!/bin/sh
if [ "\$1" = "push" ]; then
shift
exec $REAL_DOCKER push --platform linux/amd64 "\$@"
fi
exec $REAL_DOCKER "\$@"
WRAPPER
chmod +x /usr/local/bin/docker
Result: Failed. This changes which manifest gets pushed, but doesn't affect the HEAD request that requires ecr:BatchGetImage. Still 403.
Attempt 3: Build-time Flags
We added several environment variables and build options:
env:
BUILDX_NO_DEFAULT_ATTESTATIONS: 1
DOCKER_DEFAULT_PLATFORM: linux/amd64
# On the build command:
--provenance=false
Result: These are useful for keeping the image clean (no attestation layers), but they don't change the manifest format used during push. Still 403.
The Fix: Bypass Docker Push with Crane
The solution was to stop using docker push entirely. Instead, we use crane, a lightweight Go tool for interacting with container registries.
The approach:
- `docker save` exports the image as a standard Docker v2 tarball
- `crane push` sends that tarball to the registry using Docker v2 Schema 2 manifest format
Here's the workflow step:
- name: Wrap Docker push for ECR OCI compatibility
run: |
# Docker 29's containerd makes a manifest HEAD request that fails with
# 403 when ecr:BatchGetImage is missing; bypass docker push entirely:
# save the image to a tarball, then push it with crane
CRANE_VERSION=v0.20.3
curl -sL "https://github.com/google/go-containerregistry/releases/download/${CRANE_VERSION}/go-containerregistry_Linux_x86_64.tar.gz" \
| tar xz -C /usr/local/bin crane
echo "crane $(crane version)"
REAL_DOCKER=$(which docker)
cat > /usr/local/bin/docker <<WRAPPER
#!/bin/sh
if [ "\$1" = "push" ]; then
shift
# Parse image reference (skip flags like --platform)
IMAGE=""
for arg in "\$@"; do
case "\$arg" in
--*) ;;
*) IMAGE="\$arg" ;;
esac
done
echo "=== crane push workaround for Docker 29 / ECR OCI ==="
echo "Image: \$IMAGE"
TMPTAR=\$(mktemp /tmp/image-XXXXXX.tar)
$REAL_DOCKER save "\$IMAGE" -o "\$TMPTAR" && crane push "\$TMPTAR" "\$IMAGE"
RC=\$?
rm -f "\$TMPTAR"
exit \$RC
else
exec $REAL_DOCKER "\$@"
fi
WRAPPER
chmod +x /usr/local/bin/docker
echo "Docker wrapper with crane push installed."
working-directory: .
Why This Works
- `docker save` always produces a Docker v2 format tarball, regardless of the storage backend. Even with containerd, the save command outputs the standard Docker archive format.
- `crane push` reads that tarball and pushes it to the registry using Docker v2 Schema 2 media types — the format ECR has supported for years.
- The wrapper transparently intercepts `docker push` commands while passing all other Docker commands (build, login, tag, etc.) through to the real Docker binary.
- `crane` reads Docker's credential store (`~/.docker/config.json`), so ECR authentication works automatically after `docker login`.
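The flag-skipping loop inside the wrapper is easy to get wrong, so here is a standalone copy you can exercise outside CI. The `parse_image` function name is ours, introduced for illustration; the loop body mirrors the wrapper above.

```shell
# Standalone version of the wrapper's argument parsing: given the
# arguments to `docker push`, skip anything that looks like a flag
# and keep the last remaining token as the image reference.
parse_image() {
  IMAGE=""
  for arg in "$@"; do
    case "$arg" in
      --*) ;;             # skip flags such as --platform or --quiet
      *) IMAGE="$arg" ;;  # last non-flag argument wins
    esac
  done
  printf '%s\n' "$IMAGE"
}

parse_image --platform linux/amd64 123456789012.dkr.ecr.us-east-1.amazonaws.com/app:v1
```

One limitation worth knowing: a flag with a separate value (`--platform linux/amd64`) parses correctly only because the real image reference comes last, so it overwrites the flag's value in the loop.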
Why a Wrapper Script?
If you control the push command directly, you could just call crane push yourself. But many deployment tools (Laravel Vapor, Terraform, Pulumi, etc.) run docker push internally. The wrapper lets you fix the push behavior without modifying the deployment tool.
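The interception pattern itself has nothing Docker-specific about it. A minimal demonstration with a stand-in tool (all names here are invented for illustration) shows the mechanics: shadow the real binary earlier in $PATH, intercept one subcommand, and delegate everything else.

```shell
# Demonstrate the PATH-shadowing wrapper pattern with a stand-in
# command instead of docker. The wrapper intercepts one subcommand
# ("push") and delegates everything else to the real binary.
BIN=$(mktemp -d)

# The "real" tool the wrapper delegates to:
cat > "$BIN/realtool" <<'EOF'
#!/bin/sh
echo "realtool ran: $*"
EOF
chmod +x "$BIN/realtool"

# The wrapper; in CI this would be written to a directory that
# appears earlier in $PATH than the real binary:
cat > "$BIN/tool" <<EOF
#!/bin/sh
if [ "\$1" = "push" ]; then
  shift
  echo "intercepted push of: \$*"
else
  exec "$BIN/realtool" "\$@"
fi
EOF
chmod +x "$BIN/tool"

"$BIN/tool" push my-image:latest   # intercepted by the wrapper
"$BIN/tool" build .                # delegated to realtool
```

Note the unescaped `$BIN` inside the second heredoc: the real binary's path is baked into the wrapper at write time, exactly as `$REAL_DOCKER` is in the workflow step above.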
GitHub Actions ECR Authentication
This workaround assumes you're already authenticating to ECR in your GitHub Actions workflow. The standard setup uses the aws-actions/amazon-ecr-login action, which populates Docker's credential store:
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
aws-region: us-east-1
- name: Login to Amazon ECR
uses: aws-actions/amazon-ecr-login@v2
The ECR login action writes credentials to ~/.docker/config.json, which crane reads automatically. No additional authentication setup is needed — the wrapper just intercepts docker push and uses the same credentials that were already configured.
Key Takeaways
1. Docker 29 requires new IAM permissions for ECR
The switch to containerd changes how Docker talks to the registry before pushing: the manifest HEAD request now requires the ecr:BatchGetImage permission, which older Docker versions didn't need. If you push to ECR from GitHub Actions, you may be affected without warning when your runner image updates.
2. The 403 error is a permissions issue
Docker 29's containerd requires the ecr:BatchGetImage permission for the manifest HEAD request. Check CloudTrail for the specific denial, then add this permission to your IAM policy, or use the crane workaround described above.
3. Client-side fixes don't work
The Docker daemon controls the push, not the client binary. Swapping the CLI doesn't change the daemon's behavior.
4. --platform doesn't change the manifest format
It selects a single platform from a multi-platform index, but the manifest is still OCI format. True, but irrelevant here: the failure is caused by a missing permission, not by the manifest format.
5. crane is a reliable escape hatch
When Docker's push behavior doesn't work with your registry, crane gives you explicit control over what gets pushed and in what format.
Timeline
| Time | Event |
|---|---|
| 9:00 AM | First deploy succeeds — Docker 28.0.4 on runner |
| 9:15 AM | Second deploy fails with 403 Forbidden — runner image updated to Docker 29.1.5 between builds |
| +1 hour | Identified Docker 29 containerd change as root cause; Docker 27 client pin and --platform flag attempted — both failed |
| +2 hours | crane push workaround deployed and confirmed working |
When Can You Remove the Workaround?
Immediately — if you add ecr:BatchGetImage to your IAM policy.
The crane workaround was developed before we knew the root cause. Now that we know it's a permissions issue, the fix is straightforward: update your IAM role to include the permission Docker 29's containerd needs.
If you're using the crane workaround and want to remove it:
- Add `ecr:BatchGetImage` to your ECR push IAM policy
- Remove the crane wrapper script from your CI/CD
- Test a push with native `docker push`
References
- moby/moby#51532 — Docker 29 OCI manifest push issues
- actions/runner-images#13474 — Docker 29 rollout on GitHub Actions
- google/go-containerregistry (crane) — The tool that saved our deploys
- OCI Image Manifest Specification — The standard Docker 29 now uses by default
CI/CD Pipeline Breaking? Let's Fix It.
Infrastructure issues like this can cost days of debugging. A fractional CTO brings the experience to diagnose these problems quickly and implement robust solutions.
Get Help With Your Infrastructure