
Using Nano Banana for E-commerce Product Mockups (Workflow Guide)
A documented workflow for generating Shopify, Amazon, and Etsy product mockups with Nano Banana and Nano Banana Pro: references, prompts, and platform sizes.
Studio shoots are still the gold standard, but they scale poorly when a small team needs to ship 30 SKU listings before next Monday. Google DeepMind's Nano Banana family (Nano Banana, Nano Banana 2, and Nano Banana Pro) is the AI model family e-commerce sellers reach for first because two of its documented capabilities map directly onto product mockups: multi-image reference blending (up to 14 reference images) and identity consistency (up to 5 people). This guide walks the documented workflow end to end, with the per-platform sizes you need to hit. Where a number comes from Google DeepMind or a marketplace help center, it is cited; third-party figures are labeled as such.
TL;DR
- Use case: skip the studio for variants, on-model lifestyle, and packaging mockups; keep it for hero shots that need full physical accuracy.
- Workflow: upload a flat-lay plus an on-model or scene reference, then prompt the scene. Nano Banana Pro accepts up to 14 reference images and up to 5 people in a single workflow (Google DeepMind).
- Platform sizes: Shopify recommends 2048×2048 (max 5000×5000 / 20 MB), Amazon best at 2000×2000 (1600 px+ for zoom), Etsy at least 2000 px on the long side.
- Pro vs regular: Pro for hero shots and packaging text, standard Nano Banana for high-volume color and angle variants.
- Limitation: Pro is reported at ~94% in-image text accuracy (a third-party figure; see Sources). Proofread label copy before publishing.
The use case: where AI mockups actually replace a studio
Three categories of e-commerce shot are the natural fit for the Nano Banana workflow:
- Variant shots. You have one clean shot of a t-shirt in white and need the same shirt in olive, charcoal, dusty rose, and navy. A re-shoot is overkill; an AI variant pass is the entire job.
- On-model lifestyle. You have a flat-lay of a hat, but the listing converts better with a model wearing it on a bright street. Casting and shooting costs more than a small Etsy seller's monthly ad budget.
- Packaging mockups. The product label exists as a vector file. You need a photo of the bottle on a kitchen counter for the Shopify hero, and a cardboard prototype with a phone shot will not look finished.
Where AI mockups still struggle: hero shots of items where the buyer expects a literal rendering of the exact physical product, like a one-of-one ceramic mug, a hand-bound book, or fine jewelry. For those, the workflow can produce lifestyle and context shots, but the primary photo should still come from a camera.
The documented workflow
Three Nano Banana Pro capabilities documented by Google DeepMind carry the workflow: multi-image blending of up to 14 reference images, identity consistency for up to 5 characters, and native output at 1K, 2K, or 4K. Together they make a repeatable mockup pipeline.
Step 1: Build a reference set
Upload, in this order:
- Product flat-lay: clean, evenly lit, neutral background. The source of truth for color, texture, silhouette.
- Packaging or label asset: vector logo or label PNG. Keep this separate rather than asking the model to invent the wordmark.
- Scene or on-model reference: a photo of the environment you want (kitchen counter, marble shelf, model torso). Style guidance, not literal copying.
Three inputs leave headroom under the 14-image ceiling for a color swatch, a props image, or a second product angle. The more specific the inputs, the less the model invents.
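The ordering and the 14-image ceiling can be tracked with a small helper. This is a minimal sketch, not part of any Nano Banana SDK: `ReferenceSet`, its methods, and the file names are hypothetical, and the only figure taken from this guide is the 14-reference limit.

```python
from dataclasses import dataclass, field

MAX_REFERENCES = 14  # ceiling cited in this guide for Nano Banana Pro


@dataclass
class ReferenceSet:
    """Ordered reference images; order matters because prompts cite them by number."""

    items: list = field(default_factory=list)  # (role, path) tuples, in upload order

    def add(self, role: str, path: str) -> int:
        """Append a reference and return its 1-based index for use in prompts."""
        if len(self.items) >= MAX_REFERENCES:
            raise ValueError(f"reference set is full ({MAX_REFERENCES} images)")
        self.items.append((role, path))
        return len(self.items)

    def headroom(self) -> int:
        """Number of additional references that still fit under the ceiling."""
        return MAX_REFERENCES - len(self.items)


refs = ReferenceSet()
refs.add("flat-lay", "bottle_flatlay.png")   # hypothetical file names
refs.add("label", "label.png")
refs.add("scene", "kitchen_counter.jpg")
print(refs.headroom())  # 11
```

Returning the index from `add` keeps prompt text such as "reference 4" in step with the actual upload order.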
Step 2: Prompt the scene, not the product
Once the reference set is uploaded, the prompt describes the scene: lighting, camera, composition, mood. Do not re-describe the product; the references already encode it. A working pattern:
"Place [product from reference 1] on a sunlit white marble counter. Soft late-afternoon side light from the left. Natural depth of field, 50mm equivalent, eye level. Linen napkin from reference 4 lightly draped behind. Maintain the exact label and color from references 1 and 2."
The explicit "maintain" clause is the cue the model uses to lock identity from references rather than freely re-imagining.
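If you write many of these prompts, a template keeps the "maintain" clause from being forgotten. The function below is an illustrative sketch; `build_scene_prompt` and its parameters are made-up names, and it only assembles text in the pattern shown above without calling any model.

```python
def build_scene_prompt(surface, lighting, camera, product_ref=1, lock_refs=(1, 2), extras=()):
    """Assemble a scene-only prompt that always ends with an identity-lock clause."""
    sentences = [
        f"Place the product from reference {product_ref} on {surface}.",
        f"{lighting}.",
        f"{camera}.",
    ]
    sentences.extend(extras)  # optional prop sentences that cite other references
    noun = "reference" if len(lock_refs) == 1 else "references"
    lock_list = " and ".join(str(r) for r in lock_refs)
    sentences.append(f"Maintain the exact label and color from {noun} {lock_list}.")
    return " ".join(sentences)


prompt = build_scene_prompt(
    surface="a sunlit white marble counter",
    lighting="Soft late-afternoon side light from the left",
    camera="Natural depth of field, 50mm equivalent, eye level",
    extras=["Linen napkin from reference 4 lightly draped behind."],
)
print(prompt)
```

The product itself is never described in the arguments; only the scene is, which mirrors the rule in this step.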
Step 3: Regenerate variants with the same product locked
One accepted scene becomes a whole series: same product, different scene (counter to bathroom shelf to outdoor table), different lighting (golden hour to studio softbox to moody backlit), or different model (swap the on-model reference, keep the product references identical). The product is the constant; the scene is the variable.
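The "product constant, scene variable" idea amounts to a cross product of scenes and lighting setups with a fixed lock clause. A rough sketch, assuming the reference numbering from Step 1; the scene and lighting strings and the `variant_prompts` helper are illustrative only.

```python
from itertools import product

# The lock clause stays identical across every variant.
LOCK = "Maintain the exact label and color from references 1 and 2."

scenes = ["a white marble counter", "a bathroom shelf", "an outdoor table"]
lightings = ["golden hour light", "studio softbox light", "moody backlight"]


def variant_prompts(scenes, lightings, lock_clause=LOCK):
    """Return one prompt per (scene, lighting) pair with the product locked."""
    return [
        f"Place the product from reference 1 on {scene}, lit with {light}. {lock_clause}"
        for scene, light in product(scenes, lightings)
    ]


prompts = variant_prompts(scenes, lightings)
print(len(prompts))  # 9
```

Swapping the on-model reference would be a third axis; it changes the uploaded reference set rather than the prompt text, so it is left out here.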
Step 4: Finalize at the right resolution per platform
Nano Banana Pro generates natively at 1K, 2K, or 4K (Google DeepMind). Pick the output tier based on where the image will land. Sizes follow.
Resolution requirements per platform
These specs come from the marketplaces' own help centers (or, where the help center sits behind a seller login, from documentation that quotes the official figures).
Shopify
| Spec | Value |
|---|---|
| Maximum dimensions | 5000 × 5000 px |
| Maximum file size | 20 MB |
| Recommended square size | 2048 × 2048 px |
| Zoom requires at least | 800 × 800 px |
| Best file type (per Shopify) | PNG, then JPEG |
Source: Shopify Help Center, Product media types. The Shopify Help Center states that a consistent aspect ratio across featured images keeps the gallery from jumping as the buyer flicks through.
Generating at Nano Banana Pro's 2K tier and exporting at 2048 × 2048 hits the recommended size with no upscaling. 4K only buys headroom for print or hero banners with crop room.
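Choosing the tier can be reduced to "the smallest tier that covers the export size". The sketch below assumes long-side pixel values of 1024, 2048, and 4096 for the 1K, 2K, and 4K tiers; the guide only names the tiers, so treat those numbers and the `smallest_tier_for` helper as assumptions.

```python
# Assumed long-side pixel sizes per tier (not quoted from Google DeepMind).
TIER_LONG_SIDE = {"1K": 1024, "2K": 2048, "4K": 4096}


def smallest_tier_for(target_long_side: int) -> str:
    """Return the smallest tier whose assumed long side covers the target without upscaling."""
    for tier, pixels in sorted(TIER_LONG_SIDE.items(), key=lambda kv: kv[1]):
        if pixels >= target_long_side:
            return tier
    raise ValueError(f"no tier covers {target_long_side} px without upscaling")


print(smallest_tier_for(2048))  # 2K
```

Under these assumptions a 2700 px Etsy export would need the 4K tier, while 2000 px and 2048 px exports fit within 2K.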
Amazon
| Spec | Value |
|---|---|
| Minimum on the longest side | 500 px |
| Zoom function enabled at | 1000 px on the longest side |
| Recommended for best zoom | 1600 px+ on the longest side |
| Maximum on the longest side | 10,000 px |
| Maximum file size | 10 MB |
| Product fill | At least 85% of the image |
| File formats | JPEG, PNG, TIFF, GIF (no animated) |
Sources: the Amazon Seller Central guide on Product image requirements (login-walled), quoted in seller-platform summaries such as Seller Labs and Soona.
The 85% fill rule trips up AI mockups most often: the airy lifestyle shots that look beautiful on Instagram get bounced from the Amazon main image. Prompt the scene tightly cropped, or run a separate near-square-fill generation for the main slot. The widely cited "best" Amazon size is 2000 × 2000, comfortably above the 1600 px zoom threshold and within Nano Banana Pro's 2K tier without upscaling.
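A quick arithmetic check can tell you whether a generated image needs a tighter crop. This sketch assumes one common reading of the rule, namely that the product's bounding box should span at least 85% of the frame along its longer dimension; Amazon's own measurement method is not quoted here, and the bounding-box numbers are made up.

```python
def fill_ratio(img_w: int, img_h: int, box_w: int, box_h: int) -> float:
    """Largest fraction of the frame that the product bounding box spans on either axis."""
    return max(box_w / img_w, box_h / img_h)


def square_crop_side(box_w: int, box_h: int, target_fill: float = 0.85) -> int:
    """Side of the largest square crop in which the product still reaches the target fill."""
    return int(max(box_w, box_h) / target_fill)


# Hypothetical lifestyle shot: 2048 x 2048 frame, product box of 900 x 1400 px.
ratio = fill_ratio(2048, 2048, 900, 1400)
side = square_crop_side(900, 1400)
print(f"fill {ratio:.2f}, crop to {side} px square")
```

In this example the product spans about 68% of the frame, so the image would need a crop to roughly 1647 px square, which still clears the 1600 px zoom recommendation.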
Etsy
| Spec | Value |
|---|---|
| Recommended dimensions | At least 2000 px on the long side |
| First-photo minimum to avoid lower search rank | 635 px wide |
| Practical seller standard | 2000 × 2000 or 2700 × 2025 |
| Recommended orientation | Square or horizontal (1:1 desktop, 4:3 general) |
Source: Etsy Help Center, Requirements and Best Practices for Images.
Etsy emphasizes color accuracy and tells sellers to convert images to sRGB before upload. If you generate a mockup, confirm the color profile is sRGB before exporting.
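The three tables can be collapsed into a pre-upload check. The following is a sketch under stated assumptions: the numbers are the ones quoted in the tables above, "recommended" means 2048 px for Shopify, 1600 px for Amazon zoom, and 2000 px for Etsy, and `check_export` is a hypothetical helper that inspects dimensions and file size only (not color profile or fill).

```python
# Figures copied from the tables in this guide; None means no limit is quoted here.
PLATFORM_SPECS = {
    "shopify": {"recommended_long": 2048, "max_long": 5000, "max_mb": 20},
    "amazon": {"recommended_long": 1600, "max_long": 10000, "max_mb": 10},
    "etsy": {"recommended_long": 2000, "max_long": None, "max_mb": None},
}


def check_export(platform: str, width: int, height: int, size_mb: float) -> list:
    """Return human-readable problems; an empty list means the export passes these checks."""
    spec = PLATFORM_SPECS[platform]
    long_side = max(width, height)
    problems = []
    if long_side < spec["recommended_long"]:
        problems.append(f"long side {long_side} px is below the recommended {spec['recommended_long']} px")
    if spec["max_long"] is not None and long_side > spec["max_long"]:
        problems.append(f"long side {long_side} px exceeds the {spec['max_long']} px maximum")
    if spec["max_mb"] is not None and size_mb > spec["max_mb"]:
        problems.append(f"file size {size_mb} MB exceeds the {spec['max_mb']} MB maximum")
    return problems


print(check_export("shopify", 2048, 2048, 4.0))  # []
print(check_export("amazon", 2000, 2000, 12.0))
```

Update the dictionary whenever the marketplaces revise their help-center figures.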
When to use Pro for hero shots vs regular for variants
Pricing drives this trade-off. Third-party API aggregators put Nano Banana Pro at roughly $0.13 per image at 1K/2K and $0.24 at 4K; standard Nano Banana / Nano Banana 2 generally runs cheaper at the cost of slightly less consistency and lower text accuracy. Google DeepMind does not publish a numeric per-image figure, so treat aggregator numbers as indicative.
A workable rule:
- For a hero shot, packaging mockup, or anything with on-image text, use Pro at 2K or 4K. The claimed 94% text accuracy (a third-party figure; see Sources) matters most when the buyer will literally read the bottle.
- For color and angle variants of an approved hero, use standard Nano Banana or Nano Banana 2 at 1K or 2K. Identity-lock is good enough for variants the buyer only glances at.
- For lifestyle scenes for ad creative, either tier works; default to Pro for A/B at high resolution.
For a deeper price-per-pixel comparison, see Nano Banana Pro vs GPT Image.
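To see what the Pro tier costs for a batch, the aggregator prices above can be multiplied out. This is a back-of-the-envelope sketch: the per-image prices are the third-party figures quoted in this section, the count of 8 variants per SKU is a hypothetical input, and standard-tier cost is omitted because no figure for it is given here.

```python
# Third-party aggregator figures quoted in this guide, in cents per image.
PRO_CENTS = {"1K": 13, "2K": 13, "4K": 24}


def pro_batch_cost_cents(images: int, tier: str) -> int:
    """Total Pro cost in cents for a batch of images at one tier."""
    return images * PRO_CENTS[tier]


skus = 30
variants_per_sku = 8  # hypothetical batch size

hero_cost = pro_batch_cost_cents(skus, "4K")
variant_cost = pro_batch_cost_cents(skus * variants_per_sku, "2K")
print(f"heroes: ${hero_cost / 100:.2f}, variants on Pro: ${variant_cost / 100:.2f}")
# heroes: $7.20, variants on Pro: $31.20
```

Working in integer cents avoids floating-point rounding in the totals. Rejected generations are not counted, so real spend will be higher.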
Limitations to plan around
Three documented or widely reported limitations matter for e-commerce:
- Packaging text is "claimed 94%, never 100%." Nano Banana Pro's headline text accuracy is ~94% across multiple languages (a third-party figure; see Sources). For a label that says "ORGANIC LAVENDER 100ml," 94% means the model gets it right most of the time and quietly renders "LAVENDAR" or "100mI" the rest. Always proofread at full resolution.
- Brand-text legibility degrades on small or curved surfaces. Even with correct spelling, fine print on a curved bottle or ingredient panel can render as plausible-but-illegible glyphs. Fix: composite the real label PNG into the AI scene as a post-step.
- "Unnatural results, visual artifacts, or disjointed scenes" on advanced edits and blending, explicitly noted by Google DeepMind. The probability rises when you stack many references, ask for many edits in one prompt, or push the model into a scene very different from the reference inputs. Mitigation: surgical prompts, one change per round.
Workflow integration with the studio
AI mockups do not replace a product photographer. They replace the second, third, and fortieth shoot: the long tail of variants, lifestyle context shots, and seasonal refreshes that a real shoot can no longer cost-justify.
A pipeline a small team can run today:
- One real photo session per SKU: clean flat-lay, 360° turn, one detail crop. This is the source of truth.
- Generate variants and lifestyle shots with the AI workflow. Upload the flat-lay plus brand assets, prompt scenes per the workflow above, regenerate per platform aspect ratio.
- Composite labels for accuracy. Run the AI scene first, then composite the real label PNG over the bottle in any image editor. This recovers the 6% of the time the model would have miswritten the wordmark.
- Export per platform. Square 2048 for Shopify, square 2000 for Amazon and Etsy.
- Re-review every quarter. Model capabilities and marketplace specs move. Confirm against the source links before a new launch.
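The export step in this pipeline is easy to script as a plan of file names and target sizes. A minimal sketch, assuming the square export sizes listed above; the SKU identifiers, the file-naming scheme, and the `export_plan` helper are hypothetical, and the actual resizing would happen in your image tool of choice.

```python
# Square export sizes per platform, as listed in the pipeline above.
EXPORT_SIDE_PX = {"shopify": 2048, "amazon": 2000, "etsy": 2000}


def export_plan(skus, extension="png"):
    """Return (file_name, side_px) pairs for every SKU and platform combination."""
    plan = []
    for sku in skus:
        for platform, side in EXPORT_SIDE_PX.items():
            plan.append((f"{sku}_{platform}_{side}x{side}.{extension}", side))
    return plan


for file_name, side in export_plan(["mug-01", "hat-02"]):
    print(file_name, side)
```

Keeping the sizes in one dictionary means the quarterly re-review only has to touch a single place.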
FAQ
What resolution should I generate for Shopify product images?
Generate at 2K and export to 2048 × 2048, the size Shopify recommends for square product images. Generate at 4K only if the same asset will also serve print or a large hero banner.
Is the workflow safe for Amazon's main image rules?
Only if you respect the 85% fill rule. AI lifestyle shots often leave too much room around the product, which fails the main-image check. Prompt a tighter crop (or crop in post) for the main image; reserve roomier scenes for secondary positions where Amazon allows context. Source: the Amazon Seller Central guide on product image requirements, summarized by Seller Labs.
How accurate is in-image text on packaging mockups?
Nano Banana Pro is reported at roughly 94% in-image text accuracy across multiple languages (a third-party figure; see Sources). That leads the current category, but it is not 100%. Proofread at full resolution, and for any wordmark that must match exactly, composite the vector label over the AI scene as a post-step.
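Proofreading can be made more mechanical by comparing the intended label copy with what you read off the rendered image. The sketch below assumes you have already transcribed the rendered text by hand or with an OCR tool of your choice (not shown); `label_mismatches` is a hypothetical helper built on Python's standard `difflib`.

```python
from difflib import SequenceMatcher


def label_mismatches(expected: str, rendered: str) -> list:
    """Return (expected_words, rendered_words) pairs wherever the two texts differ."""
    exp_words, got_words = expected.split(), rendered.split()
    matcher = SequenceMatcher(a=exp_words, b=got_words, autojunk=False)
    return [
        (" ".join(exp_words[i1:i2]), " ".join(got_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Transcription of a hypothetical render with two subtle errors.
issues = label_mismatches("ORGANIC LAVENDER 100ml", "ORGANIC LAVENDAR 100mI")
print(issues)
```

An empty list means the transcription matches the label copy word for word. The comparison is case-sensitive, so a capital "I" rendered in place of a lowercase "l" is reported as a mismatch.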
Try the workflow
The fastest way to evaluate this end to end is to run the documented workflow against one of your own products. Try it at /nano-banana-pro: upload a flat-lay, a label asset, and a scene reference, prompt the scene, regenerate variants. Compare the output against your last studio invoice before deciding which parts of the pipeline to keep in-house.
Sources
- Gemini 3 Pro Image (Nano Banana Pro), Google DeepMind product page.
- Shopify Help Center, Product media types.
- Amazon Seller Central, Product image guide (login-walled).
- Seller Labs, Amazon Product Image Requirements 2026.
- Soona, Amazon Image Size Specs.
- Etsy Help Center, Image requirements and best practices.
- AI Free API, Nano Banana Pro capabilities reference (third-party 94% text-accuracy claim).
Last reviewed against source pages: 2026-04-18. Marketplace specs and model capabilities change periodically; confirm in the linked sources before acting on the numbers above.