An AI teaching companion that sees your screen, talks to you, and physically points at UI elements — like having an expert tutor sitting next to you.
Ships as a macOS menu bar app, an embeddable web widget (@skilly/web), and a Studio dashboard for site owners. Each surface shares the same teaching skill format, voice pipeline, and billing layer.
tryskilly.app | Download macOS app | Web SDK CDN
Now running on OpenAI's GA Realtime API (gpt-realtime). The entire voice pipeline moved off the old beta endpoint onto OpenAI's generally-available Realtime stack — lower latency and a noticeably more natural voice.
Bring your own OpenAI API key. Paste your sk-proj-... key in Settings → Account → API Key, hit "Verify & save," and the 15-minute trial gating disappears. You're billed by OpenAI directly (~$0.06–0.10 per minute of voice). No card on file with us, no subscription required.
If you'd rather skip key management, the hosted $19/month tier still exists. Both paths use the same app.
Lives in the menu bar (LSUIElement=true, no dock icon). Push-to-talk (ctrl + option) or Live Tutor mode captures your screen and mic, streams them to OpenAI Realtime, and animates a blue cursor to whatever UI element the AI references.
A ~24KB vanilla-TS Shadow DOM widget site owners drop into any web page via one <script> tag. No framework dependency. The companion sees the page structure (DOM digest — no screenshots), speaks to the user via WebRTC, and points to elements with bezier-arc cursor animation.
skilly-web-demo-trimmed.mp4
<script src="https://cdn.tryskilly.app/web/v1.js"
data-skilly-key="pk_live_..."
data-skilly-skill="your-skill-id"
data-skilly-backend-url="https://your-backend.example.com"
defer></script>Or via npm:
bun add @skilly/web # ESM buildSelf-serve control plane at tryskilly.app. Manage API keys, origin allowlists, SKILL.md content, usage metrics, and billing — all in one place. Built with Next.js App Router + Tailwind v4.
- Screen-aware tutoring — sees your screen (macOS) or DOM (web) and references specific UI elements by name
- Cursor pointing — blue cursor animates to elements via bezier arc on any connected monitor (macOS) or anywhere on the page (web)
- Teaching skills —
SKILL.mdfiles with curriculum stages, completion signals, and UI vocabulary; domain expertise layered into the system prompt - 5 bundled skills — Blender, After Effects, Premiere Pro, DaVinci Resolve, Figma
- Live Tutor mode — always-on listening with server-side VAD (no hotkey needed)
- Multi-monitor — captures and points across all connected displays (macOS)
- Bring Your Own Key — bypass billing by supplying your own OpenAI API key
- 15-minute free trial — no card required, then $19/month for 3 hours of tutoring
Hi Claude.
Clone https://github.com/tryskilly/skilly.git into my current directory.
Then read the CLAUDE.md. I want to get Skilly running locally on my Mac.
Help me set up everything — the Cloudflare Worker with my own API keys, the proxy URLs, and getting it building in Xcode. Walk me through it.
- macOS 14.2+ (for ScreenCaptureKit)
- Xcode 15+
- Node.js 18+ / Bun 1.x (for the Worker and Studio backend)
- A Cloudflare account (free tier works)
- API keys for: OpenAI and WorkOS
The Worker is a proxy that holds your API keys and handles auth, billing, and API routing. Nothing sensitive ships in the app binary.
cd worker
npm installAdd your secrets:
npx wrangler secret put OPENAI_API_KEY
npx wrangler secret put WORKOS_API_KEY
npx wrangler secret put SESSION_TOKEN_SECRET
npx wrangler secret put POLAR_ACCESS_TOKEN
npx wrangler secret put POLAR_WEBHOOK_SECRETConfigure wrangler.toml with your WorkOS and Polar settings, then deploy:
npx wrangler deployopen leanring-buddy.xcodeproj- Select the
leanring-buddyscheme (the typo is intentional/legacy) - Set your signing team under Signing & Capabilities
- Hit Cmd + R to build and run
The app appears in your menu bar (not the dock). Click the icon to open the panel.
Do NOT run xcodebuild from the terminal — it invalidates TCC permissions and the app will need to re-request screen recording, accessibility, etc.
cd apps/web-backend
bun install
cp .env.example .env.local # fill in your keys
bun run devSee apps/web-backend/.env.example for all required environment variables.
- Microphone — push-to-talk voice capture
- Accessibility — global keyboard shortcut
- Screen Recording — screenshots when you talk
macOS app ─────────────────────────────────────────────────────┐
Push-to-talk / Live Tutor │
OpenAI Realtime WebSocket (audio + vision + TTS) │
Blue cursor overlay (bezier arc, all monitors) │
Skill system (SKILL.md → 5-layer system prompt) │
│
@skilly/web widget ─────────────────────────────────────────── │ ─── Cloudflare Worker proxy
Shadow DOM embed (~24KB IIFE) │ /openai/token
DOM digest (screenshot-free page view) │ /auth/*
OpenAI Realtime over WebRTC (mic up, voice down) │ /entitlement
Bezier-arc cursor pointing (data-skilly / CSS / text) │ /webhooks/polar
│
Studio dashboard (apps/web-backend) ──────────────────────────┘
Next.js App Router + Tailwind v4 + Postgres
Multi-tenant: keys, origins, skills, usage, billing (Polar)
WorkOS AuthKit (Google SSO + magic link)
The macOS app and web widget share the same Cloudflare Worker for token relay and the same SKILL.md teaching format. A shared Rust core (core/) provides deterministic policy, realtime state-machine, and prompt-composition logic compiled as a C ABI dylib (Swift bridges) and WASM (browser widget).
leanring-buddy/ # macOS Swift source (the typo stays)
CompanionManager.swift # Central state machine + voice pipeline
OpenAIRealtimeClient.swift # OpenAI Realtime WebSocket client
SkillManager.swift # Skill loading, activation, curriculum
SkillPromptComposer.swift # 5-layer system prompt composition
AuthManager.swift # WorkOS auth + Keychain session
EntitlementManager.swift # Billing entitlement + trial tracking
OverlayWindow.swift # Blue cursor overlay + pointing animation
RustPolicyBridge.swift # Swift ↔ Rust FFI (policy, skills, realtime)
worker/ # Cloudflare Worker API proxy
src/index.ts # Auth, billing, OpenAI token relay
apps/
web-backend/ # Studio dashboard (Next.js + Tailwind v4)
sdk/
web/ # @skilly/web embed widget source (TypeScript)
src/index.ts # Public API + SkillyController
src/realtime.ts # OpenAI Realtime over WebRTC
src/pointing.ts # DOM digest + bezier-arc cursor engine
ios/companion/ # iOS SkillyCompanion Swift Package
android/ # Android SDK (Kotlin, UniFFI)
core/ # Shared Rust workspace
policy/ # Deterministic entitlement/trial/cap engine
skills/ # Prompt composition (curriculum + vocabulary)
realtime/ # Realtime turn/session state machine
ffi/ # C ABI boundary for Swift bridges
web-sdk/ # WASM bindings for the browser widget
skills/ # Bundled teaching skills
blender-fundamentals/ # Blender 4.x (6 curriculum stages)
after-effects-basics/ # Adobe After Effects
premiere-pro-basics/ # Adobe Premiere Pro
davinci-resolve-basics/ # DaVinci Resolve
figma-basics/ # Figma
scripts/ # Build + release automation
build-web-cdn.sh # Builds @skilly/web → sdk/web-cdn-public/
release.sh # macOS DMG + notarization + GitHub Release
docs/ # Architecture docs, PRDs, analytics tracking plan
Skills are SKILL.md files that give Skilly domain expertise. The 5 bundled skills are seeded automatically on first launch. To install a custom skill, drop its directory in ~/.skilly/skills/:
mkdir -p ~/.skilly/skills
cp -r skills/blender-fundamentals ~/.skilly/skills/Skills auto-activate when you open their target app (e.g., the Blender skill activates when Blender is frontmost). You can also drag a skill folder onto the Skilly panel or manage skills via the per-row overflow menu.
Skilly is a fork of farzaa/clicky — Farza's open-source AI cursor-buddy. The macOS foundation (ScreenCaptureKit integration, app focus tracking, accessibility hooks) is built on his work.
What Skilly adds on top:
- Live Tutor mode — continuous voice-gated conversation, not just push-to-talk
- Single-call OpenAI Realtime pipeline — one WebSocket replaces the old TTS + STT + LLM chain
- Pluggable Skills layer —
SKILL.mdfiles with curriculum stages, completion signals, and UI vocabulary @skilly/webembed — bring the companion to any website via a<script>tag- Studio dashboard — self-serve multi-tenant control plane for site owners
- Shared Rust core — deterministic policy + prompt logic compiled for desktop, mobile, and WASM
Thanks Farza for shipping it open-source.
This is a fork of farzaa/clicky (MIT License). Original upstream code is attributed in THIRD_PARTY_NOTICES.md. Skilly-specific additions are marked with // MARK: - Skilly throughout the codebase.
- Vulnerability reporting:
SECURITY.md - Third-party notices:
THIRD_PARTY_NOTICES.md - Privacy data inventory:
docs/privacy-data-inventory.md
Copyright 2026 Mohamed Saleh Zaied (moelabs.dev)
Licensed under the Apache License, Version 2.0. See LICENSE for details.
Original Clicky code by Farza, licensed under MIT.
