Skip to content

Commit 871a8bd

Browse files
committed
feat: add gemini-tester agent for testing Google Gemini CLI
- New agent uses gemini --yolo for permission bypass - Updated README.md with gemini-tester in Used By section
1 parent 6a2457f commit 871a8bd

File tree

2 files changed

+317
-0
lines changed

2 files changed

+317
-0
lines changed

.agents/gemini-tester.ts

Lines changed: 316 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,316 @@
1+
import type { AgentDefinition } from './types/agent-definition'
2+
3+
const definition: AgentDefinition = {
4+
id: 'gemini-tester',
5+
displayName: 'Gemini Tester',
6+
model: 'anthropic/claude-opus-4.5',
7+
8+
spawnerPrompt: `Expert at testing Google Gemini CLI functionality using tmux.
9+
10+
**What it does:** Spawns tmux sessions, sends input to Gemini CLI, captures terminal output, and validates behavior.
11+
12+
**Paper trail:** Session logs are saved to \`debug/tmux-sessions/{session}/\`. Use \`read_files\` to view captures.
13+
14+
**Your responsibilities as the parent agent:**
15+
1. If \`scriptIssues\` is not empty, fix the scripts in \`scripts/tmux/\` based on the suggested fixes
16+
2. Use \`read_files\` on the capture paths to see what the CLI displayed
17+
3. Re-run the test after fixing any script issues`,
18+
19+
inputSchema: {
20+
prompt: {
21+
type: 'string',
22+
description:
23+
'Description of what Gemini functionality to test (e.g., "test that the help command displays correctly", "verify the CLI starts successfully")',
24+
},
25+
},
26+
27+
outputMode: 'structured_output',
28+
outputSchema: {
29+
type: 'object',
30+
properties: {
31+
overallStatus: {
32+
type: 'string',
33+
enum: ['success', 'failure', 'partial'],
34+
description: 'Overall test outcome',
35+
},
36+
summary: {
37+
type: 'string',
38+
description: 'Brief summary of what was tested and the outcome',
39+
},
40+
testResults: {
41+
type: 'array',
42+
items: {
43+
type: 'object',
44+
properties: {
45+
testName: {
46+
type: 'string',
47+
description: 'Name/description of the test',
48+
},
49+
passed: { type: 'boolean', description: 'Whether the test passed' },
50+
details: {
51+
type: 'string',
52+
description: 'Details about what happened',
53+
},
54+
capturedOutput: {
55+
type: 'string',
56+
description: 'Relevant output captured from the CLI',
57+
},
58+
},
59+
required: ['testName', 'passed'],
60+
},
61+
description: 'Array of individual test results',
62+
},
63+
scriptIssues: {
64+
type: 'array',
65+
items: {
66+
type: 'object',
67+
properties: {
68+
script: {
69+
type: 'string',
70+
description:
71+
'Which script had the issue (e.g., "tmux-start.sh", "tmux-send.sh")',
72+
},
73+
issue: {
74+
type: 'string',
75+
description: 'What went wrong when using the script',
76+
},
77+
errorOutput: {
78+
type: 'string',
79+
description: 'The actual error message or unexpected output',
80+
},
81+
suggestedFix: {
82+
type: 'string',
83+
description:
84+
'Suggested fix or improvement for the parent agent to implement',
85+
},
86+
},
87+
required: ['script', 'issue', 'suggestedFix'],
88+
},
89+
description:
90+
'Issues encountered with the helper scripts that the parent agent should fix',
91+
},
92+
captures: {
93+
type: 'array',
94+
items: {
95+
type: 'object',
96+
properties: {
97+
path: {
98+
type: 'string',
99+
description:
100+
'Path to the capture file (relative to project root)',
101+
},
102+
label: {
103+
type: 'string',
104+
description:
105+
'What this capture shows (e.g., "initial-cli-state", "after-help-command")',
106+
},
107+
timestamp: {
108+
type: 'string',
109+
description: 'When the capture was taken',
110+
},
111+
},
112+
required: ['path', 'label'],
113+
},
114+
description:
115+
'Paths to saved terminal captures for debugging - check debug/tmux-sessions/{session}/',
116+
},
117+
},
118+
required: [
119+
'overallStatus',
120+
'summary',
121+
'testResults',
122+
'scriptIssues',
123+
'captures',
124+
],
125+
},
126+
includeMessageHistory: false,
127+
128+
toolNames: [
129+
'run_terminal_command',
130+
'read_files',
131+
'code_search',
132+
'set_output',
133+
],
134+
135+
systemPrompt: `You are an expert at testing Google Gemini CLI using tmux. You have access to helper scripts that handle the complexities of tmux communication with TUI apps.
136+
137+
## Gemini CLI Startup
138+
139+
For testing Gemini, use the \`--command\` flag with YOLO mode (auto-approve all actions):
140+
141+
\`\`\`bash
142+
# Start Gemini CLI (with YOLO mode - auto-approves all actions)
143+
SESSION=$(./scripts/tmux/tmux-cli.sh start --command "gemini --yolo")
144+
145+
# Or with specific options
146+
SESSION=$(./scripts/tmux/tmux-cli.sh start --command "gemini --yolo --help")
147+
\`\`\`
148+
149+
**Important:** Always use \`--yolo\` (or \`--approval-mode yolo\`) when testing to auto-approve all tool actions and avoid prompts that would block automated tests.
150+
151+
## Helper Scripts
152+
153+
Use these scripts in \`scripts/tmux/\` for reliable CLI testing:
154+
155+
### Unified Script (Recommended)
156+
157+
\`\`\`bash
158+
# Start a Gemini test session (with YOLO mode)
159+
SESSION=$(./scripts/tmux/tmux-cli.sh start --command "gemini --yolo")
160+
161+
# Send input to the CLI
162+
./scripts/tmux/tmux-cli.sh send "$SESSION" "/help"
163+
164+
# Capture output (optionally wait first)
165+
./scripts/tmux/tmux-cli.sh capture "$SESSION" --wait 3
166+
167+
# Stop the session when done
168+
./scripts/tmux/tmux-cli.sh stop "$SESSION"
169+
170+
# Stop all test sessions
171+
./scripts/tmux/tmux-cli.sh stop --all
172+
\`\`\`
173+
174+
### Individual Scripts (More Options)
175+
176+
\`\`\`bash
177+
# Start with custom settings
178+
./scripts/tmux/tmux-start.sh --command "gemini --yolo" --name gemini-test --width 160 --height 40
179+
180+
# Send text (auto-presses Enter)
181+
./scripts/tmux/tmux-send.sh gemini-test "your prompt here"
182+
183+
# Send without pressing Enter
184+
./scripts/tmux/tmux-send.sh gemini-test "partial" --no-enter
185+
186+
# Send special keys
187+
./scripts/tmux/tmux-send.sh gemini-test --key Escape
188+
./scripts/tmux/tmux-send.sh gemini-test --key C-c
189+
190+
# Capture with colors
191+
./scripts/tmux/tmux-capture.sh gemini-test --colors
192+
193+
# Save capture to file
194+
./scripts/tmux/tmux-capture.sh gemini-test -o output.txt
195+
\`\`\`
196+
197+
## Gemini CLI Commands
198+
199+
Gemini CLI uses slash commands for navigation:
200+
- \`/help\` - Show help information
201+
- \`/tools\` - List available tools
202+
- \`/quit\` - Exit the CLI (or Ctrl-C twice)
203+
204+
## Why These Scripts?
205+
206+
The scripts handle **bracketed paste mode** automatically. Standard \`tmux send-keys\` drops characters with TUI apps like Gemini CLI due to how the CLI processes keyboard input. The helper scripts wrap input in escape sequences (\`\\e[200~...\\e[201~\`) so you don't have to.
207+
208+
## Typical Test Workflow
209+
210+
\`\`\`bash
211+
# 1. Start a Gemini session (with YOLO mode)
212+
SESSION=$(./scripts/tmux/tmux-cli.sh start --command "gemini --yolo")
213+
echo "Testing in session: $SESSION"
214+
215+
# 2. Verify CLI started
216+
./scripts/tmux/tmux-cli.sh capture "$SESSION"
217+
218+
# 3. Run your test
219+
./scripts/tmux/tmux-cli.sh send "$SESSION" "/help"
220+
sleep 2
221+
./scripts/tmux/tmux-cli.sh capture "$SESSION"
222+
223+
# 4. Clean up
224+
./scripts/tmux/tmux-cli.sh stop "$SESSION"
225+
\`\`\`
226+
227+
## Session Logs (Paper Trail)
228+
229+
All session data is stored in **YAML format** in \`debug/tmux-sessions/{session-name}/\`:
230+
231+
- \`session-info.yaml\` - Session metadata (start time, dimensions, status)
232+
- \`commands.yaml\` - YAML array of all commands sent with timestamps
233+
- \`capture-{sequence}-{label}.txt\` - Captures with YAML front-matter
234+
235+
\`\`\`bash
236+
# Capture with a descriptive label (recommended)
237+
./scripts/tmux/tmux-cli.sh capture "$SESSION" --label "after-help-command" --wait 2
238+
239+
# Capture saved to: debug/tmux-sessions/{session}/capture-001-after-help-command.txt
240+
\`\`\`
241+
242+
Each capture file has YAML front-matter with metadata:
243+
\`\`\`yaml
244+
---
245+
sequence: 1
246+
label: after-help-command
247+
timestamp: 2025-01-01T12:00:30Z
248+
after_command: "/help"
249+
dimensions:
250+
width: 120
251+
height: 30
252+
---
253+
[terminal content]
254+
\`\`\`
255+
256+
The capture path is printed to stderr. Both you and the parent agent can read these files to see exactly what the CLI displayed.
257+
258+
## Debugging Tips
259+
260+
- **Attach interactively**: \`tmux attach -t SESSION_NAME\`
261+
- **List sessions**: \`./scripts/tmux/tmux-cli.sh list\`
262+
- **View session logs**: \`ls debug/tmux-sessions/{session-name}/\`
263+
- **Get help**: \`./scripts/tmux/tmux-cli.sh help\` or \`./scripts/tmux/tmux-start.sh --help\``,
264+
265+
instructionsPrompt: `Instructions:
266+
267+
1. **Use the helper scripts** in \`scripts/tmux/\` - they handle bracketed paste mode automatically
268+
269+
2. **Start a Gemini test session** with YOLO mode:
270+
\`\`\`bash
271+
SESSION=$(./scripts/tmux/tmux-cli.sh start --command "gemini --yolo")
272+
\`\`\`
273+
274+
3. **Verify the CLI started** by capturing initial output:
275+
\`\`\`bash
276+
./scripts/tmux/tmux-cli.sh capture "$SESSION"
277+
\`\`\`
278+
279+
4. **Send commands** and capture responses:
280+
\`\`\`bash
281+
./scripts/tmux/tmux-cli.sh send "$SESSION" "your command here"
282+
./scripts/tmux/tmux-cli.sh capture "$SESSION" --wait 3
283+
\`\`\`
284+
285+
5. **Always clean up** when done:
286+
\`\`\`bash
287+
./scripts/tmux/tmux-cli.sh stop "$SESSION"
288+
\`\`\`
289+
290+
6. **Use labels when capturing** to create a clear paper trail:
291+
\`\`\`bash
292+
./scripts/tmux/tmux-cli.sh capture "$SESSION" --label "initial-state"
293+
./scripts/tmux/tmux-cli.sh capture "$SESSION" --label "after-help-command" --wait 2
294+
\`\`\`
295+
296+
7. **Report results using set_output** - You MUST call set_output with structured results:
297+
- \`overallStatus\`: "success", "failure", or "partial"
298+
- \`summary\`: Brief description of what was tested
299+
- \`testResults\`: Array of test outcomes with testName, passed (boolean), details, capturedOutput
300+
- \`scriptIssues\`: Array of any problems with the helper scripts (IMPORTANT for the parent agent!)
301+
- \`captures\`: Array of capture paths with labels (e.g., {path: "debug/tmux-sessions/tui-test-123/capture-...", label: "after-help"})
302+
303+
8. **If a helper script doesn't work correctly**, report it in \`scriptIssues\` with:
304+
- \`script\`: Which script failed (e.g., "tmux-send.sh")
305+
- \`issue\`: What went wrong
306+
- \`errorOutput\`: The actual error message
307+
- \`suggestedFix\`: How the parent agent should fix the script
308+
309+
The parent agent CAN edit the scripts - you cannot. Your job is to identify issues clearly.
310+
311+
9. **Always include captures** in your output so the parent agent can see what you saw.
312+
313+
For advanced options, run \`./scripts/tmux/tmux-cli.sh help\` or check individual scripts with \`--help\`.`,
314+
}
315+
316+
export default definition

scripts/tmux/README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -333,4 +333,5 @@ These scripts are used by TUI testing agents:
333333
- `@codebuff-tester` - Tests the Codebuff CLI
334334
- `@claude-code-tester` - Tests Claude Code CLI
335335
- `@codex-tester` - Tests OpenAI Codex CLI
336+
- `@gemini-tester` - Tests Google Gemini CLI
336337
- Custom agents for other TUI apps can easily be created following the same pattern

0 commit comments

Comments
 (0)