
How to Write JSON Prompts for Nano Banana Pro
JSON-style prompts for Nano Banana Pro: what's documented by Google, what's community convention, common keys, and when JSON helps versus hurts.
JSON prompts for Nano Banana Pro (Gemini 3 Pro Image, model id gemini-3-pro-image-preview) have become a popular convention since the model launched in November 2025. JSON prompt generators, GitHub schemas, LinkedIn threads, and YouTube tutorials all promise "100% accuracy" or "92% precision" if you swap your descriptive paragraph for a structured JSON object.
The honest version is more interesting. Google itself does not document a JSON prompt format. The official guides recommend descriptive natural-language paragraphs. The community built JSON conventions on top of that, and the evidence on whether JSON actually changes outputs is mixed. This guide splits what Google has confirmed from what the community has proposed, lists the JSON keys you will see most often, and shows when reaching for JSON pays off.
TL;DR
- JSON prompting is a community convention, not an officially supported API format. Google's Nano Banana prompt guide recommends narrative paragraphs around five components: style, subject, setting, action, composition.
- The model parses natural language. Whatever JSON you send is converted to tokens like any other text. There is no JSON parser on the image side that maps `"lighting": "..."` to a specific control.
- JSON's real benefit is on the human side: reproducibility, batch generation, programmatic templating, and version control of prompt fields.
- Independent tests are mixed. Chase Jarvis's controlled comparison found JSON and natural-language outputs "essentially the same"; community guides report subjective accuracy gains.
- Don't confuse two different "JSON" features: Gemini's structured output via `responseSchema` constrains the model's response to a JSON shape. That is a real, documented API feature, but it is for text responses, not for shaping an image prompt.
Why people reach for JSON prompts
Three reasons keep coming up across community write-ups:
- Clarity for the human writer. A flat paragraph mixes adjectives across subject, lighting, lens, and mood. JSON forces each adjective into a labelled slot, so you notice when you forgot to specify the lens.
- Reproducibility. Change a single field (say `"camera.lens": "85mm"` to `"24mm"`) and re-run with everything else fixed. Same reason designers keep design tokens in JSON.
- Batch generation. Template the JSON, fill `subject.product_name` from a CSV, and serialize into the `contents` field of the Gemini API call. The model still receives a string, but the producer side has a clean fan-out.
None of these require the model to "understand JSON." They are organizational benefits for the prompt author.
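The batch pattern above can be sketched in a few lines. This is a hypothetical fan-out, not an official SDK feature: the template shape and field names (`subject.product_name`, etc.) follow the community convention, and the parsed CSV rows are stand-ins.

```typescript
// Hypothetical batch fan-out: one JSON prompt template, one variable per row.
// The field names are community convention, not an official Google schema.
type PromptTemplate = {
  style: string;
  subject: { type: string; product_name: string };
  setting: string;
};

const template: PromptTemplate = {
  style: "studio product photograph",
  subject: { type: "product", product_name: "" },
  setting: "seamless white backdrop, soft overhead light",
};

// Rows as they might come out of a parsed CSV.
const rows = [{ product_name: "ceramic mug" }, { product_name: "walnut desk tray" }];

// Each prompt is just a string by the time it reaches the API's `contents` field.
const prompts: string[] = rows.map((row) =>
  JSON.stringify(
    { ...template, subject: { ...template.subject, product_name: row.product_name } },
    null,
    2,
  ),
);
```

Each entry in `prompts` would be sent as ordinary text; the model never sees anything but the serialized string.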
What Google actually documents
Google DeepMind's Nano Banana prompt guide and the Google AI for Developers image generation docs both push descriptive prose, not structured fields. The recommended structure is:
| Component | What it covers |
|---|---|
| Style | Photograph, illustration, watercolour, 3D render, etc. |
| Subject | Character or object — appearance, clothing, expression |
| Setting | Location, environment, era |
| Action | What the subject is doing in the frame |
| Composition | Shot type, angle, framing |
The Google AI for Developers docs explicitly say: "Describe the scene, don't just list keywords. The model's core strength is its deep language understanding. A narrative, descriptive paragraph will almost always produce a better, more coherent image than a simple list of disconnected words." That guidance does not change for Nano Banana Pro. The Gemini 3 Pro Image Preview model page refers back to the same image-generation guide.
The Google Developers Blog post on prompting Gemini 2.5 Flash Image repeats the same advice with templates like "A photorealistic [shot type] of [subject], [action or expression], set in [environment]...", which is flowing prose with placeholders, not JSON.
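Google's placeholder template can be filled mechanically without any JSON in the prompt at all. A minimal sketch — the `fill` helper is ours, not Google's; only the bracketed-slot template shape comes from the blog post:

```typescript
// Fill a prose template like Google's
// "A photorealistic [shot type] of [subject], [action or expression], set in [environment]."
// The output is a plain narrative sentence, which is what Google's docs recommend sending.
const narrativeTemplate =
  "A photorealistic [shot type] of [subject], [action or expression], set in [environment].";

function fill(template: string, slots: Record<string, string>): string {
  // Replace each [slot name] with its value; leave unknown slots untouched.
  return template.replace(/\[([^\]]+)\]/g, (_match, name) => slots[name] ?? `[${name}]`);
}

const prompt = fill(narrativeTemplate, {
  "shot type": "close-up portrait",
  subject: "an elderly clockmaker",
  "action or expression": "inspecting a pocket watch with a loupe",
  environment: "a cluttered workshop at dusk",
});
```

You get the organizational benefit (named slots, easy batching) while the model still receives flowing prose.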
There is no Google-published JSON schema for image prompts and no documented prompt.lighting field that the API parses. Anyone telling you otherwise is selling you a community pattern, not an API contract.
The "JSON" feature Google does document
To avoid a common mix-up: Gemini's API does support structured output via responseSchema. That feature constrains what the model returns to a JSON object you define. It is widely used to extract fields from images, generate captions in a fixed shape, or pipe model outputs into downstream code. It is not the same as putting JSON in your image-generation prompt. It controls the response side rather than the prompt side, and does not apply to the image bytes themselves.
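For contrast, here is roughly what a structured-output request body looks like, sketched as a plain object. The field names (`generationConfig.responseMimeType`, `generationConfig.responseSchema`) follow the public Gemini REST docs as we understand them — confirm against the current reference before relying on this shape:

```typescript
// Structured output constrains the *text response* — e.g. extracting fields
// from an input image. Note the schema lives in generationConfig, not in the
// prompt itself: this is unrelated to JSON-formatted image prompts.
const requestBody = {
  contents: [{ parts: [{ text: "Describe the main object in this image." }] }],
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: {
      type: "OBJECT",
      properties: {
        object_name: { type: "STRING" },
        dominant_color: { type: "STRING" },
      },
      required: ["object_name"],
    },
  },
};

// This is what would be POSTed to the generateContent endpoint.
const serialized = JSON.stringify(requestBody);
```

The prompt text inside `contents` is still natural language; only the response is constrained.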
Common JSON keys observed in the community
Even though no schema is official, a handful of top-level keys recur across community proposals. The clearest catalogue is alexewerlof's GitHub gist, positioned as a "structured prompting schema for high-fidelity image generation." Marketing-focused write-ups like the Atlabs guide and the aiformarketings guide converge on similar fields.
Across those community sources, the keys you will see most often are:
| Key | Typical contents | Equivalent in Google's narrative model |
|---|---|---|
| `subject` | Character or object: type, description, clothing, expression, pose | Subject |
| `style` | Photograph, illustration, painting, render | Style |
| `setting` / `scene` | Location, environment, time of day | Setting |
| `lighting` | Source, direction, quality (golden hour, softbox, neon, etc.) | Folded into Style/Setting |
| `camera` | Lens focal length, aperture, ISO, film stock | Folded into Style |
| `composition` | Shot type, angle, framing, focus point | Composition |
| `palette` | Colour scheme or dominant colours | Folded into Style |
| `mood` | Emotional tone (serene, ominous, joyful) | Folded into Style |
| `text_rendering` | In-image text content and typography | Covered separately by Google's text-in-images guidance |
| `negative` / `prohibitions` | Things to avoid | Not officially supported; Nano Banana has no documented negative-prompt field |
Two cautions:
- The `negative` field is the leakiest. Google's docs do not document a negative-prompt mechanism for Nano Banana. The model reads the list as descriptive text like everything else, which sometimes biases the output toward the words you wanted to avoid.
- `camera` parameters like `"aperture": "f/2.0"` are interpreted as descriptive text. The model isn't simulating an aperture; it has learned that "shot at f/2.0" correlates with shallow depth of field in its training data. The effect is real but statistical, not optical.
A worked example: JSON and its plain-English twin
Here is the same prompt in both formats. Both produce comparable results from Nano Banana Pro because, under the hood, both end up as a string of tokens passed to the same model.
JSON form (community convention):
```json
{
  "style": "editorial photograph",
  "subject": {
    "type": "woman, mid-30s",
    "clothing": "charcoal wool coat, cream silk scarf",
    "expression": "calm, looking off-camera"
  },
  "setting": "rain-slicked Tokyo side street at dusk, neon signs in the background",
  "lighting": "ambient neon and a cool key light from camera-left",
  "camera": {
    "lens": "85mm",
    "aperture": "f/1.8"
  },
  "composition": "medium shot, subject on the left third, vertical 9:16 framing",
  "palette": "teal, magenta, deep blue with warm skin tones",
  "mood": "contemplative, cinematic"
}
```

Plain-English equivalent (the form Google's guide recommends):
An editorial photograph of a calm woman in her mid-30s, wearing a charcoal wool coat and a cream silk scarf, looking off-camera. She stands on a rain-slicked Tokyo side street at dusk with neon signs glowing in the background. Ambient neon mixes with a cool key light from camera-left. Shot at 85mm, f/1.8, medium shot framed vertically (9:16) with the subject placed on the left third. Teal, magenta, and deep blue dominate the palette against warm skin tones. The mood is contemplative and cinematic.
Both feed Nano Banana Pro the same information: wardrobe, location, lighting setup, lens, framing, palette, mood. The JSON version is easier to mutate programmatically; the prose version is easier to read aloud and matches Google's documented guidance. Both are valid. Neither has a magic accuracy multiplier.
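Because both forms carry the same information, converting between them is mechanical. A toy flattener, assuming the community-style keys used above (nothing here is an official mapping, and real prose would read better than this labelled output):

```typescript
// Walk a community-style prompt object and emit labelled prose fragments.
// The model treats either form as plain text; this just makes the
// equivalence between JSON and prose concrete.
function toProse(value: unknown, label?: string): string {
  if (typeof value === "string") return label ? `${label}: ${value}` : value;
  if (value && typeof value === "object") {
    return Object.entries(value as Record<string, unknown>)
      .map(([k, v]) => toProse(v, label ? `${label} ${k}` : k))
      .join(". ");
  }
  return "";
}

const prose = toProse({
  style: "editorial photograph",
  camera: { lens: "85mm", aperture: "f/1.8" },
  mood: "contemplative, cinematic",
});
// Nested keys flatten into labelled phrases, e.g. "camera lens: 85mm".
```

A real pipeline would polish the fragments into full sentences, but the information content is identical either way.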
When JSON helps and when it hurts
Helpful when:
- You are running batches. Hundreds of product shots, characters, or social posts where you want one variable to change per row. JSON in your codebase, serialized into the prompt at send time, is the right tool.
- You are versioning prompts in git. Diffs on a JSON object are readable; diffs on a paragraph rewrite are noise.
- You are working with a team. Designers and engineers can edit named fields without touching each other's prose.
- You build downstream tooling. A prompt builder UI, a CMS-driven generator, or a CLI tool maps cleanly to a JSON schema.
Hurts (or at best, neutral) when:
- You are running a one-off creative prompt. The friction of writing JSON for a single image rarely earns its keep. Prose is faster and reads more naturally to the model.
- You believe JSON gives stricter control. Chase Jarvis's controlled comparison extracted an image's description as JSON, translated it to natural language, and ran both versions through Nano Banana. He found "these images all look essentially the same, with the expected random variation." His take: JSON is "a placebo" that adds non-visual tokens (brackets, quotes, commas) and competes for the model's attention budget.
- You over-specify. Twenty nested fields where five would do dilute the prompt. Models have finite context budgets; spending them on `"meta.guidance_scale": 7.5` (which Nano Banana does not expose anyway) is wasted ink.
- You rely on a `negative` key. As above, there is no documented negative-prompt API for Nano Banana. Your `"negative": ["blurry", "extra fingers"]` becomes literal text in the prompt, and may bias the output toward exactly what you wanted to avoid.
The mental model worth holding: JSON is a human-side organizational tool, not a strict API. The model parses natural language. JSON helps you write natural language more methodically.
Tooling: how to use JSON prompts in production
The pattern in most community implementations:
- Author the JSON in code. Keep prompt templates as `.json` files or typed TypeScript objects.
- Serialize before sending. `JSON.stringify(promptObject, null, 2)`. The indentation is cheap tokens and helps if you ever paste into Google AI Studio for debugging.
- Optional: prepend a natural-language summary. A `user_intent` or `summary` field written as a plain sentence ("An editorial portrait of a woman on a Tokyo street at dusk") gives the model the gist immediately, with the JSON fields as elaboration.
- Send as the `contents` field. The Gemini API expects text in `contents`. There is no "JSON mode" for image prompts; your serialized JSON is just a string.
- Log the JSON and the output. Because the JSON is structured, you can index it later and find every prompt with `"camera.lens": "85mm"`.
Third-party tools like JSON Prompt Generator let non-developers author JSON visually, then copy the serialized output into Google AI Studio. They are front-ends for the same convention, not official Google products. For team workflows, build a small template layer: a JSON Schema for your team's prompt shape, validation that fails the build on missing fields, and a serializer that posts to the API.
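The template layer can be tiny. A sketch of the build-time validation step, with the required-keys list chosen arbitrarily for illustration (any real team would pick its own):

```typescript
// Fail fast when a prompt template is missing fields the team agreed on.
// The key list is a team convention, not a Google requirement.
const REQUIRED_KEYS = ["subject", "setting", "lighting", "composition"] as const;

function missingKeys(prompt: Record<string, unknown>): string[] {
  return REQUIRED_KEYS.filter((k) => !(k in prompt) || prompt[k] === "");
}

// A draft template that forgot to specify lighting.
const draft = { subject: "ceramic mug", setting: "white backdrop", composition: "top-down" };
const missing = missingKeys(draft);
// A CI check would reject this template until `lighting` is filled in.
```

Wiring `missingKeys` into CI means a malformed template fails the build instead of silently producing a vaguer image.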
Practical advice if you are starting today
- Don't switch one-off prompts to JSON. For ad-hoc creative work in the Gemini app, prose matches what Google documents and reads more naturally.
- Reach for JSON when templating. Anything you would put in a spreadsheet, put in JSON.
- Treat community schemas as starting points, not specs. Pick four or five keys you actually need (subject, setting, lighting, camera, composition is a common minimum) and ignore the rest.
- Test both formats on your own brief. Chase Jarvis's result is one test. Run your own A/B and default to whichever is easier to maintain.
- Don't oversell JSON to your team. "92% accuracy" sets expectations the model will not meet. The honest pitch is "easier to template, version, and batch."
The original Nano Banana (Gemini 2.5 Flash Image) and the current Nano Banana Pro both behave the same way: they parse natural language. JSON is a wrapper for your benefit, not theirs.
Frequently asked questions
Does Google officially support JSON prompts for Nano Banana Pro?
No. The official Nano Banana prompt guide and the Google AI for Developers image-generation docs recommend descriptive natural-language paragraphs covering style, subject, setting, action, and composition. JSON prompts are a community convention layered on top.
Will a JSON prompt produce a better image than the same prompt in prose?
The evidence is mixed. Chase Jarvis's controlled test found JSON and natural-language outputs essentially indistinguishable. Marketing-focused community write-ups claim accuracy gains in the 60-92% range, but those numbers are not from controlled studies and the methodologies are usually not published. Your best bet: A/B test both formats on your own brief.
What about responseSchema, isn't that JSON for Gemini?
That is a different feature. Structured output via responseSchema constrains the model's response to a JSON shape, useful for extracting structured data from images, generating captions, or piping outputs into code. It does not apply to image generation, where the output is a PNG, not a JSON object.
Does Nano Banana support negative prompts via a negative JSON key?
No documented support. Google's image-generation docs do not list a negative-prompt mechanism. If you put "negative": ["blurry"] in your JSON, the model reads it as descriptive text, which sometimes biases the output toward the words you wanted to avoid. Phrase what you want positively instead ("sharp, in-focus subject").
Are there official JSON schemas I should follow?
No official schemas. Community proposals like alexewerlof's gist and the pauhu/gemini-image-prompting-handbook repository are the most-cited starting points, but none are endorsed by Google. Pick the keys that map to your team's workflow and keep the schema as small as possible.
Where can I try JSON prompts on Nano Banana Pro?
The fastest way is Google AI Studio. Paste your serialized JSON into the prompt box and run. For production work, the Gemini API accepts the same string in its contents field. The studio at gptimg.co wraps the model in a browser UI with a free trial quota, useful for comparing prose vs JSON side by side without writing API code.
Sources
- Nano Banana prompt guide (Google DeepMind, official prompting guidance)
- Nano Banana image generation (Google AI for Developers)
- Gemini 3 Pro Image Preview model page (Google AI for Developers)
- How to prompt Gemini 2.5 Flash Image (Google Developers Blog)
- Structured output (`responseSchema`) (Google AI for Developers)
- Does JSON Prompting Actually Work? Tested with Nano Banana (Chase Jarvis)
- Nano Banana structured JSON prompt schema (alexewerlof, community proposal)
- Nano Banana Pro JSON Prompting Guide (Atlabs AI, community guide)
- Nano Banana JSON Prompt Format (aiformarketings, community guide)
- gemini-image-prompting-handbook (pauhu, open-source community schema)
Last reviewed against source pages: 2026-04-18. Google's documentation may add or change recommendations; confirm in the linked sources before standardizing on any pattern.