Desktop Control: Power Meets Caution
Weaszel includes an experimental desktop control feature that extends beyond browser automation. This opens up powerful new possibilities, but it comes with important caveats you need to understand.
⚠️ Important Notice
Desktop Control is an experimental feature. It can be unreliable and requires elevated macOS permissions. We recommend sticking with Browser Automation (the default) unless you have a specific need for desktop control.
What is Desktop Control?
Desktop Control allows Weaszel to interact with any application on your Mac—TextEdit, Finder, VS Code, you name it. Using macOS Accessibility APIs and AppleScript, the agent can:
- Open and close applications
- Click, type, and navigate within apps
- Execute AppleScript commands for advanced automation
- Take screenshots to understand the current state
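The capabilities above can be sketched with two command-line tools that ship with macOS: `osascript` (runs AppleScript) and `screencapture` (takes screenshots). This is a minimal illustration, not Weaszel's actual implementation; the helper names (`osascript_command`, `screenshot_command`, `open_app`) are hypothetical. The command builders are kept separate from the runner so the logic can be inspected without a Mac.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes an argv list and returns the command's stdout.
Runner = Callable[[Sequence[str]], str]


def osascript_command(script: str) -> List[str]:
    """Build the argv for running one AppleScript snippet with osascript."""
    return ["osascript", "-e", script]


def screenshot_command(path: str) -> List[str]:
    """Build the argv for a full-screen capture (-x suppresses the shutter sound)."""
    return ["screencapture", "-x", path]


def run_command(argv: Sequence[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout.strip()


def open_app(app_name: str, runner: Runner = run_command) -> str:
    """Bring an application to the front (hypothetical helper, not Weaszel's API)."""
    # AppleScript string literals use double quotes, so escape backslashes and quotes.
    safe = app_name.replace("\\", "\\\\").replace('"', '\\"')
    return runner(osascript_command(f'tell application "{safe}" to activate'))
```

On a Mac, `open_app("TextEdit")` should bring TextEdit to the front, and `run_command(screenshot_command("/tmp/state.png"))` should save a capture of the screen, provided the permissions described under Safety & Permissions have been granted.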
✅ Pros: When Desktop Control Shines
- ✓ Local App Automation: Automate tasks in native macOS apps like Notes, TextEdit, or Finder
- ✓ Beyond the Browser: Handle workflows that span both web and desktop (e.g., download a file, then open it in Preview)
- ✓ AppleScript Power: Execute complex macOS automation scripts for advanced users
- ✓ Unified Agent: One AI agent for all your automation needs, not just web tasks
❌ Cons: The Reality Check
- ✗ Flaky and Unreliable: Desktop automation is inherently fragile. Apps steal focus, coordinates shift, and timing issues are common.
- ✗ Requires Permissions: You must grant Screen Recording and Accessibility permissions to your terminal, which some users may find invasive.
- ✗ Slower Than Browser: Desktop actions involve more overhead and are generally slower than browser automation.
- ✗ macOS Only: This feature is currently exclusive to macOS due to reliance on AppleScript and Accessibility APIs.
- ✗ Can Get Stuck: If the agent misidentifies the active window or an app doesn't respond as expected, it may loop or fail.
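One common way to contain the focus-stealing and looping problems listed above is to check which app is frontmost before each action and to cap the number of attempts. The sketch below illustrates that general pattern only; it does not describe how Weaszel behaves today. Every name in it (`run_with_focus_check`, `DesktopActionError`, the callback parameters) is hypothetical.

```python
import time
from typing import Callable


class DesktopActionError(RuntimeError):
    """Raised when the target app never becomes frontmost within the attempt limit."""


def run_with_focus_check(
    action: Callable[[], None],
    frontmost_app: Callable[[], str],
    expected_app: str,
    refocus: Callable[[], None],
    max_attempts: int = 3,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `action` only when `expected_app` is frontmost.

    Returns the attempt number on which the action ran. Raises
    DesktopActionError after `max_attempts` so the caller cannot loop forever.
    """
    for attempt in range(1, max_attempts + 1):
        if frontmost_app() == expected_app:
            action()
            return attempt
        # Another app holds focus: bring the target back and wait briefly.
        refocus()
        sleep(delay_seconds)
    raise DesktopActionError(
        f"{expected_app} was not frontmost after {max_attempts} attempts"
    )
```

The hard attempt limit is the important design choice here: a failed action surfaces as an error the user can see instead of an agent that silently retries forever.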
🛡️ Safety & Permissions
To use Desktop Control, you'll need to grant two macOS permissions to the application you launch Weaszel from (Terminal.app, iTerm, or VS Code if you use its integrated terminal):
- Screen Recording: Allows Weaszel to see your screen and understand the current state
- Accessibility: Allows Weaszel to control your mouse and keyboard
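These are powerful permissions. Because they are granted to the terminal application rather than to Weaszel itself, other programs you run from that terminal can use them too, so be aware of what you're granting. The agent will guide you through the setup process on first run.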
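If you want to review or revoke these permissions yourself, macOS has accepted deep links into the relevant settings panes. The sketch below builds the `open` commands for them. It assumes the widely used `x-apple.systempreferences:` URLs shown here, which Apple does not formally document and which may change between macOS versions; `open_pane_command` is a hypothetical helper, not part of Weaszel.

```python
from typing import Dict, List

# Assumed deep links into the Privacy & Security panes (not officially documented).
PRIVACY_PANES: Dict[str, str] = {
    "accessibility": (
        "x-apple.systempreferences:"
        "com.apple.preference.security?Privacy_Accessibility"
    ),
    "screen_recording": (
        "x-apple.systempreferences:"
        "com.apple.preference.security?Privacy_ScreenCapture"
    ),
}


def open_pane_command(permission: str) -> List[str]:
    """Build the argv that asks macOS to open the settings pane for a permission."""
    if permission not in PRIVACY_PANES:
        raise ValueError(f"unknown permission: {permission!r}")
    return ["open", PRIVACY_PANES[permission]]
```

Passing the result to `subprocess.run` on a Mac should open the matching pane, where you can toggle access for your terminal application.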
🎯 Our Recommendation
Stick with Browser Automation unless you have a specific need for desktop control.
Compared with Desktop Control, browser automation:
- Is more reliable and battle-tested
- Is faster and more efficient
- Doesn't require invasive permissions
- Covers 95% of automation use cases (job applications, research, shopping, etc.)
Desktop Control is there for the 5% of cases where you truly need to interact with local apps. Think of it as a power tool: incredibly useful in the right hands, but not something you reach for every day.
🚀 Getting Started
To enable Desktop Control, set EXPERIMENTAL_DESKTOP_ENABLED=true in your .env.local file.
# In .env.local
EXPERIMENTAL_DESKTOP_ENABLED=true
When you run Weaszel, it will automatically route tasks to either browser or desktop mode based on what you ask for.
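To make the flag and the routing idea concrete, here is a rough sketch of how a tool could read the setting and choose a mode. It is an illustration under stated assumptions, not Weaszel's actual logic. The keyword list, the function names (`parse_env_file`, `desktop_enabled`, `choose_mode`), and the rule that only the literal value `true` enables the feature are all hypothetical. The real router may work quite differently.

```python
from typing import Dict

# Hypothetical hints that a task targets a local app rather than a website.
DESKTOP_HINTS = ("textedit", "finder", "notes", "preview", "vs code")


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def desktop_enabled(env: Dict[str, str]) -> bool:
    """Assumption: only the value 'true' (case-insensitive) enables the feature."""
    return env.get("EXPERIMENTAL_DESKTOP_ENABLED", "").lower() == "true"


def choose_mode(task: str, env: Dict[str, str]) -> str:
    """Return 'desktop' or 'browser'; browser stays the default."""
    if not desktop_enabled(env):
        return "browser"
    lowered = task.lower()
    if any(hint in lowered for hint in DESKTOP_HINTS):
        return "desktop"
    return "browser"
```

With the flag set, `choose_mode("Open TextEdit and write a note", env)` picks desktop mode in this sketch, while a task such as "Search for flights" stays in browser mode. Without the flag, everything goes to the browser.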
🔮 The Future
Desktop Control is experimental, and we're actively working to improve its reliability. Future updates may include:
- Better error recovery and retry logic
- Smarter window focus management
- Support for Windows and Linux (if there's demand)
- Pre-built automation scripts for common desktop tasks
Your feedback is crucial! If you try Desktop Control, let us know what works, what doesn't, and what you'd like to see improved.
💡 Final Thoughts
Desktop Control is a glimpse into the future of AI agents—assistants that can truly work across your entire computer, not just the web. But it's early days. Use it wisely, report bugs, and help us make it better. 🦊
Questions or feedback? Reach out on GitHub.