Introducing Experimental Desktop Control

Weaszel includes an experimental desktop control feature that extends beyond browser automation. This opens up powerful new possibilities, but it comes with important caveats you need to understand.

⚠️ Important Notice

Desktop Control is an experimental feature. It can be unreliable and requires elevated macOS permissions. We recommend sticking with Browser Automation (the default) unless you have a specific need for desktop control.

What is Desktop Control?

Desktop Control allows Weaszel to interact with any application on your Mac—TextEdit, Finder, VS Code, you name it. Using macOS Accessibility APIs and AppleScript, the agent can:

Open and close applications
Click, type, and navigate within apps
Execute AppleScript commands for advanced automation
Take screenshots to understand the current state
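To make the list above a little more concrete, here is a minimal Python sketch of how an agent could drive apps through AppleScript. It is not Weaszel's actual implementation: the helper names are made up for illustration, and the only real tools it relies on are the `osascript` and `screencapture` commands that ship with macOS.

```python
import subprocess
import sys


def _quote(name: str) -> str:
    """Escape backslashes and double quotes for use inside an AppleScript string."""
    return name.replace("\\", "\\\\").replace('"', '\\"')


def activate_app_script(app_name: str) -> str:
    """AppleScript that launches an app if needed and brings it to the front."""
    return f'tell application "{_quote(app_name)}" to activate'


def quit_app_script(app_name: str) -> str:
    """AppleScript that asks an app to quit."""
    return f'tell application "{_quote(app_name)}" to quit'


def build_osascript_command(script: str) -> list[str]:
    """Argument list that runs a single AppleScript snippet via osascript."""
    return ["osascript", "-e", script]


def screenshot_command(path: str) -> list[str]:
    """Argument list that saves a silent full-screen capture to `path`."""
    return ["screencapture", "-x", path]


def run_applescript(script: str, timeout: float = 10.0) -> str:
    """Run the script and return its stdout. Hypothetical helper; macOS only."""
    if sys.platform != "darwin":
        raise RuntimeError("osascript is only available on macOS")
    result = subprocess.run(
        build_osascript_command(script),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.strip()
```

On a Mac, `run_applescript(activate_app_script("TextEdit"))` would bring TextEdit to the front. Building the command separately from running it keeps the string handling easy to test on any platform.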

✅ Pros: When Desktop Control Shines

✓ Local App Automation: Automate tasks in native macOS apps like Notes, TextEdit, or Finder
✓ Beyond the Browser: Handle workflows that span both web and desktop (e.g., download a file, then open it in Preview)
✓ AppleScript Power: Execute complex macOS automation scripts for advanced users
✓ Unified Agent: One AI agent for all your automation needs, not just web tasks

❌ Cons: The Reality Check

✗ Flaky and Unreliable: Desktop automation is inherently fragile. Apps steal focus, coordinates shift, and timing issues are common.
✗ Requires Permissions: You must grant Screen Recording and Accessibility permissions to your terminal, which some users may find invasive.
✗ Slower Than Browser: Desktop actions involve more overhead and are generally slower than browser automation.
✗ macOS Only: This feature is currently exclusive to macOS due to reliance on AppleScript and Accessibility APIs.
✗ Can Get Stuck: If the agent misidentifies the active window or an app doesn't respond as expected, it may loop or fail.
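A common guard against the "loop or fail" problem is to cap how many times a flaky step may be retried. The sketch below shows that general pattern in Python. It is an illustration only, not how Weaszel handles failures, and `run_with_retries` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def run_with_retries(
    action: Callable[[], bool],
    max_attempts: int = 3,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `action` until it returns True and return the attempts used.

    Raises RuntimeError once `max_attempts` tries have all failed, so a
    stuck step ends with an error instead of looping forever.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if action():
            return attempt
        if attempt < max_attempts:
            # Linear backoff: wait a little longer after each failure.
            sleep(delay_seconds * attempt)
    raise RuntimeError(f"action did not succeed after {max_attempts} attempts")
```

The `sleep` parameter is injectable so the backoff can be skipped in tests. A real agent would likely also re-check which window is active before each retry.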

🛡️ Safety & Permissions

To use Desktop Control, you'll need to grant two macOS permissions to the app that hosts your terminal session (Terminal.app, iTerm, or VS Code if you use its integrated terminal):

Screen Recording: Allows Weaszel to see your screen and understand the current state
Accessibility: Allows Weaszel to control your mouse and keyboard

These are powerful permissions: together they let a process see everything on your screen and send input to any app. Weaszel uses them responsibly, but you should be aware of what you're granting. The agent will guide you through the setup process on first run.
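If you would rather review these settings yourself, the Python sketch below opens the relevant panes in System Settings. It assumes the `x-apple.systempreferences` URL anchors shown here, which are widely used but not a documented, stable Apple API, so they may change between macOS releases. The function names are hypothetical and not part of Weaszel.

```python
import subprocess
import sys

# Assumption: these anchors open the matching Privacy & Security panes.
# They are commonly used but not guaranteed by Apple to stay stable.
PRIVACY_PANES = {
    "screen_recording": (
        "x-apple.systempreferences:"
        "com.apple.preference.security?Privacy_ScreenCapture"
    ),
    "accessibility": (
        "x-apple.systempreferences:"
        "com.apple.preference.security?Privacy_Accessibility"
    ),
}


def open_pane_command(permission: str) -> list[str]:
    """Argument list for the macOS `open` command targeting one pane."""
    if permission not in PRIVACY_PANES:
        raise ValueError(f"unknown permission: {permission!r}")
    return ["open", PRIVACY_PANES[permission]]


def open_privacy_pane(permission: str) -> None:
    """Open the pane so the user can review or grant access. macOS only."""
    if sys.platform != "darwin":
        raise RuntimeError("System Settings panes exist only on macOS")
    subprocess.run(open_pane_command(permission), check=True)
```

This only opens the pane. macOS still requires you to toggle each permission by hand, which is exactly the point of the prompt.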

🎯 Our Recommendation

Stick with Browser Automation unless you have a specific need for desktop control.

Browser automation:

Is more reliable and battle-tested
Is faster and more efficient
Doesn't require invasive permissions
Covers 95% of automation use cases (job applications, research, shopping, etc.)

Desktop Control is there for the 5% of cases where you truly need to interact with local apps. Think of it as a power tool: incredibly useful in the right hands, but not something you reach for every day.

🚀 Getting Started

To enable Desktop Control, set EXPERIMENTAL_DESKTOP_ENABLED=true in your .env.local file.

# In .env.local
EXPERIMENTAL_DESKTOP_ENABLED=true
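For a sense of how a flag like this is typically consumed, here is a small Python sketch that parses a `.env.local`-style file and checks the setting. Only the variable name and the value `true` come from this post. The parser, the function names, and the choice to treat any other value as "off" are assumptions, not Weaszel's actual loading code.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env.local") -> dict[str, str]:
    """Read the file if it exists; a missing file means no overrides."""
    env_file = Path(path)
    if not env_file.exists():
        return {}
    return parse_env_text(env_file.read_text(encoding="utf-8"))


def desktop_enabled(env: dict[str, str]) -> bool:
    """Assumption: only the value `true` (any letter case) turns it on."""
    return env.get("EXPERIMENTAL_DESKTOP_ENABLED", "").lower() == "true"
```

With the file contents shown above, `desktop_enabled(load_env_file())` returns `True`. With no file, or with the line removed, it returns `False`, which matches browser automation being the default.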

When you run Weaszel, it will automatically route tasks to either browser or desktop mode based on what you ask for.
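How Weaszel decides between the two modes isn't described here, so the following Python sketch is purely illustrative. It assumes a naive keyword check and assumes that tasks always go to the browser when the flag is off. `DESKTOP_HINTS` and `choose_mode` are hypothetical names.

```python
# Hypothetical hints that a task targets a local macOS app.
DESKTOP_HINTS = ("finder", "textedit", "preview", "vs code", "applescript")


def choose_mode(task: str, desktop_enabled: bool) -> str:
    """Return "desktop" or "browser" for a plain-language task.

    Assumption: with the experimental flag off, everything stays in
    browser mode, since browser automation is the default.
    """
    if not desktop_enabled:
        return "browser"
    lowered = task.lower()
    if any(hint in lowered for hint in DESKTOP_HINTS):
        return "desktop"
    return "browser"
```

For example, `choose_mode("Open TextEdit and draft a memo", True)` returns `"desktop"`, while the same request with the flag off returns `"browser"`. Substring matching like this misfires easily (a "job finder" website would match `finder`), so a real router would need something smarter.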

🔮 The Future

Desktop Control is experimental, and we're actively working to improve its reliability. Future updates may include:

Better error recovery and retry logic
Smarter window focus management
Support for Windows and Linux (if there's demand)
Pre-built automation scripts for common desktop tasks

Your feedback is crucial! If you try Desktop Control, let us know what works, what doesn't, and what you'd like to see improved.

💡 Final Thoughts

Desktop Control is a glimpse into the future of AI agents—assistants that can truly work across your entire computer, not just the web. But it's early days. Use it wisely, report bugs, and help us make it better. 🦊

Questions or feedback? Reach out on GitHub.