midi-cli-rs: Music Generation for AI Coding Agents
1063 words • 6 min read • Abstract

AI coding agents can write code, generate images, and produce text. But what about music? When I needed background audio for explainer videos, I wanted a tool that AI agents could use directly—no music theory required.
| Resource | Link |
|---|---|
| Video | midi-cli-rs Explainer![]() |
| Examples | Listen to Samples |
| Code | midi-cli-rs |
The Problem
Generating music programmatically is hard. Traditional approaches require understanding music theory, MIDI specifications, instrument mappings, and audio synthesis. That’s a lot to ask of an AI agent that just needs a 5-second intro.
I wanted something simpler: a CLI tool where an agent could say “give me 5 seconds of suspenseful music” and get a usable WAV file.
The Solution: Mood Presets
midi-cli-rs solves this with mood presets—curated musical generators that produce complete compositions from a single command:
# Generate a 5-second suspenseful intro
midi-cli-rs preset --mood suspense --duration 5 -o intro.wav
# Upbeat outro with specific key
midi-cli-rs preset -m upbeat -d 7 --key C --seed 42 -o outro.wav
Six moods are available:
| Mood | Character |
|---|---|
suspense |
Low drones, tremolo strings, tension |
eerie |
Sparse tones, diminished harmony |
upbeat |
Rhythmic chords, energetic |
calm |
Warm pads, gentle arpeggios |
ambient |
Textural drones, pentatonic bells |
jazz |
Walking bass, brushed drums, piano trio |
Each mood generates multi-layer compositions with appropriate instruments, rhythms, and harmonies. The --seed parameter ensures reproducibility—same seed, same output. Different seeds produce meaningful variations in melody contour, rhythm patterns, and instrument choices.
Melodic Variation
The presets don’t just randomize notes—they use a contour-based variation system. Changing the seed produces melodies that follow different shapes (ascending, descending, arch, wave) while staying musically coherent. This means you can generate multiple versions of a mood and pick the one that fits best.
How It Works
The tool generates MIDI programmatically, then renders to WAV using FluidSynth:
Mood Preset → MIDI Generation → FluidSynth → WAV Output
MIDI generation uses the midly crate to create standard MIDI files. Each preset generates multiple tracks with different instruments, note patterns, and dynamics.
Audio rendering calls FluidSynth as a subprocess with a SoundFont (instrument samples). This avoids LGPL licensing complications—subprocess execution doesn’t trigger copyleft.
Note-Level Control
When presets aren’t enough, you can specify exact notes:
# Note format: PITCH:DURATION:VELOCITY[@OFFSET]
midi-cli-rs generate \
--notes "C4:0.5:80@0,E4:0.5:80@0.5,G4:0.5:80@1,C5:1:90@1.5" \
-i piano -t 120 -o arpeggio.wav
Or use JSON for complex multi-track arrangements:
echo '{"tempo":90,"instrument":"piano","notes":[
{"pitch":"C4","duration":0.5,"velocity":80,"offset":0},
{"pitch":"E4","duration":0.5,"velocity":80,"offset":0.5},
{"pitch":"G4","duration":1,"velocity":90,"offset":1}
]}' | midi-cli-rs generate --json -o output.wav
Web UI
For interactive composition, there’s a browser-based interface:
midi-cli-rs serve # Starts on http://127.0.0.1:3105
The Presets tab lets you adjust mood, key, duration, intensity, and tempo with immediate audio preview. Click the clock button to generate a time-based seed for unique but reproducible results.
The Melodies tab provides note-by-note composition with keyboard shortcuts:
a-gfor note pitch[/]to adjust duration+/-to change octaveTabto navigate between notes
For AI Agents
The CLI is designed for AI agent usage:
- Simple commands: One line generates complete audio
- Reproducible: Seed values ensure consistent output
- Self-documenting:
--helpincludes agent-specific instructions - Composable: Generate tracks separately, combine with ffmpeg
# AI agent workflow
midi-cli-rs preset -m suspense -d 5 --seed 1 -o intro.wav
midi-cli-rs preset -m upbeat -d 10 --seed 2 -o main.wav
ffmpeg -i intro.wav -i main.wav -filter_complex concat=n=2:v=0:a=1 final.wav
SoundFont Quality Matters
The quality of generated audio depends heavily on the SoundFont used. SoundFonts are collections of audio samples for each instrument—a tiny SoundFont with compressed samples will sound thin and artificial, while a larger one with high-quality recordings produces professional results.
| SoundFont | Size | Quality | License |
|---|---|---|---|
| TimGM6mb | ~6MB | Basic | GPL v2 |
| GeneralUser GS | ~30MB | Good | Permissive |
| FluidR3_GM | ~140MB | Very Good | MIT |
| MuseScore_General | ~200MB | Excellent | MIT |
For anything beyond quick prototypes, use a quality SoundFont. The difference is dramatic—the same MIDI file can sound like a toy keyboard or a real instrument depending on the samples.
The tool auto-detects SoundFonts in common locations (~/.soundfonts/, /opt/homebrew/share/soundfonts/, etc.), or specify one explicitly with --soundfont.
Technical Details
Built with Rust 2024 edition using permissively licensed dependencies:
| Crate | Purpose |
|---|---|
| midly | MIDI file generation |
| clap | CLI argument parsing |
| serde | JSON serialization |
| rand | Randomization for presets |
| axum | Web server (for serve command) |
FluidSynth is called as a subprocess for WAV rendering, keeping the main codebase MIT-licensed.
Try It
Listen to sample outputs, or build locally:
git clone https://github.com/softwarewrighter/midi-cli-rs.git
cd midi-cli-rs
cargo build --release
./target/release/midi-cli-rs preset -m jazz -d 5 -o jazz.wav
Requires FluidSynth for WAV output (brew install fluid-synth on macOS).
| Series: Personal Software (Part 2) | Previous: cat-finder: Local ML in Rust | Next: midi-cli-rs: Custom Mood Packs |
Disclaimer
You are responsible for how you use generated audio. Ensure you have the appropriate rights and permissions for any commercial or public use. This tool generates MIDI data algorithmically—how you render and distribute the final audio is your responsibility.
Be aware that algorithmic composition can inadvertently produce sequences similar to existing copyrighted works. Whether you use this tool, AI generation, or compose by hand, you must verify that your output doesn’t infringe on existing copyrights before public release or commercial use. Protect yourself legally.
Part 2 of the Personal Software series. View all parts | Next: Part 3 →
