How to Use the AI/LLM Node in n8n (Gemini)
The Gemini AI node in n8n connects your workflows to Google's Gemini models. This guide shows you exactly how to configure it, what settings matter, and how to avoid the common mistakes that waste tokens and produce garbage output.
What You Need Before Starting
n8n instance running version 1.0 or later. The Gemini integration requires the modern AI node architecture. If you're on 0.x versions, upgrade first.
Google Cloud project with the Gemini API enabled. Go to console.cloud.google.com, create a project, enable the Generative Language API, and generate an API key.
Clear use case definition. Know exactly what input you're sending and what output format you need. Vague requirements produce vague results.
Step 1: Add and Configure the Gemini Node
Drag the "Google Gemini Chat Model" node onto your canvas. Do not use a generic LLM node for this; the dedicated Gemini node handles Google authentication and model selection for you.
Open the node settings. Under "Credential to connect with," click "Create New Credential." Paste your API key and save.
Select your model. For most professional services work:
- gemini-1.5-pro: Best for complex reasoning, document analysis, long context (up to 1M tokens). Use this for contract review, research synthesis, multi-document comparison.
- gemini-1.5-flash: Faster, cheaper, good for simple classification, data extraction, routine responses. Use this for intake form processing, basic Q&A, sentiment analysis.
Set your temperature between 0.0 and 1.0:
- 0.0-0.3: Deterministic, consistent output. Use for data extraction, classification, anything requiring reliability.
- 0.4-0.7: Balanced creativity and consistency. Use for client communications, content drafting.
- 0.8-1.0: Maximum creativity. Rarely useful in professional services. Avoid unless you're brainstorming.
Step 2: Structure Your Input Properly
The Gemini node accepts a "messages" array. Each message needs a role (system, user, or assistant) and content.
System message sets behavior and constraints. This is where you define output format, tone, and rules. Example:
You are a legal document analyzer. Extract key dates, parties, and obligations from contracts. Output valid JSON only with these exact fields: parties (array), effective_date (ISO 8601), termination_date (ISO 8601), obligations (array of objects with party and description). No explanatory text.
User message contains your actual input. If you're processing form data, structure it clearly:
Analyze this engagement letter:
[DOCUMENT TEXT]
Client name from form: `{{$json.client_name}}`
Service type: `{{$json.service_type}}`
Connect an upstream node (HTTP Request, Webhook, or similar) and reference its fields in your prompt with expressions like `{{$json.fieldname}}`.
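Putting the pieces together, the messages array can be built in a plain function. This is an illustrative sketch: the `role`/`content` shape follows the convention described above, and the input field names (`document_text`, `client_name`, `service_type`) are assumptions standing in for your actual form fields.

```javascript
// Illustrative shape of the messages array for the Gemini node.
// System message sets format rules; user message carries the input.
function buildMessages(item) {
  return [
    {
      role: "system",
      content:
        "You are a legal document analyzer. Output valid JSON only with " +
        "these exact fields: parties, effective_date, termination_date, obligations.",
    },
    {
      role: "user",
      content:
        `Analyze this engagement letter:\n${item.document_text}\n` +
        `Client name from form: ${item.client_name}\n` +
        `Service type: ${item.service_type}`,
    },
  ];
}

const messages = buildMessages({
  document_text: "[DOCUMENT TEXT]",
  client_name: "Acme LLC",
  service_type: "contract review",
});
```

In a workflow you would populate these fields from `{{$json...}}` expressions rather than literals.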
Step 3: Configure Output Parsing
Under "Options," enable "JSON Output" if you need structured data. This forces Gemini to return valid JSON and automatically parses it for downstream nodes.
Set "Max Tokens" based on your expected output length:
- Simple extraction: 500-1000 tokens
- Detailed analysis: 2000-4000 tokens
- Full document generation: 8000+ tokens
Never leave this unlimited. You'll waste money on runaway responses.
Enable "Stop Sequences" if you need to halt generation at specific markers. For example, if generating email drafts, add ---END--- as a stop sequence and include it in your system prompt.
Step 4: Handle the Response
The Gemini node outputs a message object. Access the content with `{{$json.message.content}}`.
If you enabled JSON output, the parsed object is directly available: `{{$json.message.content.parties[0]}}`.
Add an IF node immediately after Gemini to check for errors or unexpected formats:
`{{$json.message.content}}` is not empty
AND
`{{$json.message.content}}` does not contain "I cannot"
Route failures to a notification node or error handler. Never assume LLM output is well-formed.
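The IF-node checks above can be sketched as one predicate (the refusal phrase is the one from the conditions; extend the list for your own failure modes, and add JSON parsing when you expect structured output as a string):

```javascript
// Mirrors the IF-node checks: non-empty content that is not a refusal
// and, when JSON is expected, actually parses.
function isUsableResponse(content, expectJson = false) {
  if (typeof content !== "string" || content.trim() === "") return false;
  if (content.includes("I cannot")) return false;
  if (expectJson) {
    try {
      JSON.parse(content);
    } catch {
      return false;
    }
  }
  return true;
}
```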
Step 5: Optimize for Cost and Speed
Batch requests when possible. Instead of calling Gemini once per item in a loop, collect 10-20 items and send them in a single prompt with clear delimiters.
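The batching idea can be sketched as a helper that folds items into delimited prompts (the `### ITEM n` delimiter is an assumption; any marker your system prompt explains will do):

```javascript
// Collect items into delimited batch prompts so one Gemini call
// covers a chunk of items instead of one call per item.
function buildBatchPrompts(items, batchSize = 10) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const body = items
      .slice(i, i + batchSize)
      .map((item, j) => `### ITEM ${i + j + 1}\n${item}`)
      .join("\n\n");
    batches.push(`Process each item separately.\n\n${body}`);
  }
  return batches;
}
```

Instruct the model in the system prompt to answer per item (e.g. one JSON object per `### ITEM`), so the batch response stays parseable.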
Cache system prompts. If you're using the same instructions repeatedly, Gemini caches them automatically. Keep your system message consistent across calls.
Use Flash for preprocessing. Run cheap classification or filtering with gemini-1.5-flash first, then send only relevant items to gemini-1.5-pro for deep analysis.
Monitor token usage. Add a "Set" node after Gemini that logs `{{$json.usage.total_tokens}}` to a Google Sheet. Track costs weekly.
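For the weekly tracking, a rough per-call cost estimate can live in a Code node. The per-1K-token rates below are placeholders, not real prices: check Google's current pricing page and note that input and output tokens are usually billed at different rates.

```javascript
// Placeholder per-1K-token rates - substitute Google's current pricing.
const PRICE_PER_1K = {
  "gemini-1.5-flash": 0.00015,
  "gemini-1.5-pro": 0.0025,
};

function estimateCostUSD(model, totalTokens) {
  const rate = PRICE_PER_1K[model];
  if (rate === undefined) throw new Error(`Unknown model: ${model}`);
  return (totalTokens / 1000) * rate;
}
```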
Real Implementation: Client Intake Processing
Here's a complete workflow that processes new client intake forms:
Node 1: Webhook receives form submission with fields: client_name, industry, service_requested, budget, timeline, description.
Node 2: Gemini (Flash) classifies urgency and fit:
System message:
Classify this intake request. Output JSON with: urgency (high/medium/low), service_match (exact/partial/none), estimated_hours (number), red_flags (array of strings or empty array).
User message:
Client: `{{$json.client_name}}`
Industry: `{{$json.industry}}`
Service: `{{$json.service_requested}}`
Budget: `{{$json.budget}}`
Timeline: `{{$json.timeline}}`
Description: `{{$json.description}}`
Node 3: IF checks if service_match is "exact" or "partial" AND urgency is "high" or "medium."
Node 4: Gemini (Pro) generates detailed intake summary and next steps (only for qualified leads):
System message:
You are an intake coordinator for a professional services firm. Create a detailed intake summary and recommended next steps. Output JSON with: summary (2-3 sentences), recommended_service_tier (standard/premium/custom), next_steps (array of specific actions), assigned_team (string), estimated_timeline (string).
Node 5: Google Sheets logs all results with timestamp, classification, and summary.
Node 6: Slack notifies the appropriate team channel with the summary and assignment.
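Node 3's routing rule, expressed as a predicate over the classification fields that Node 2 produces:

```javascript
// Qualified leads continue to the Pro summary step (Node 4);
// everything else is only logged.
function isQualifiedLead(c) {
  return (
    ["exact", "partial"].includes(c.service_match) &&
    ["high", "medium"].includes(c.urgency)
  );
}
```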
This workflow processes 100+ intake forms per week and costs about $12/month in API fees.
Common Mistakes to Avoid
Sending unstructured prompts. "Analyze this document" produces useless output. Specify exactly what to extract and in what format.
Ignoring context limits. Gemini 1.5 Pro handles 1M tokens, but that doesn't mean you should send entire case files. Extract relevant sections first.
Not validating output. LLMs occasionally return malformed JSON, refusals, or hallucinated fields. Check every response before it reaches downstream systems.
Using high temperature for factual tasks. Temperature above 0.3 introduces randomness. For extraction and classification, stay at 0.0-0.2.
Forgetting to handle rate limits. Google enforces 60 requests per minute on standard tier. Add a "Wait" node (1 second) in loops to avoid failures.
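The Wait-node spacing follows from simple arithmetic: the minimum delay between calls is 60,000 ms divided by the per-minute limit. A sketch (the 60/min figure is the one cited above; your tier may differ):

```javascript
// Milliseconds to wait between calls to stay at or under the limit.
function delayForRate(requestsPerMinute) {
  return Math.ceil(60000 / requestsPerMinute);
}
```

At 60 requests/minute this gives 1000 ms, which is why a 1-second Wait node in the loop is the usual fix.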
Bottom Line
The Gemini node works best when you treat it like a specialized employee: give it clear instructions, structured input, and validate its work. Start with gemini-1.5-flash for simple tasks, upgrade to Pro only when you need deep reasoning. Monitor costs weekly and optimize prompts based on actual output quality, not theoretical capabilities.

Reviewed by Revenue Institute
This guide is actively maintained and reviewed by the implementation experts at Revenue Institute. As the creators of The AI Workforce Playbook, we test and deploy these exact frameworks for professional services firms scaling without new headcount.
Need help turning this guide into reality? Revenue Institute builds and implements the AI workforce for professional services firms.