Oppo open-sources Android AI agent X-OmniClaw that uses your camera, screen, and voice without leaving the phone

amu · May 17, 2026, 7:42am

Oppo has released the source code for X-OmniClaw, its multimodal AI agent for Android that operates entirely on the device. The agent uses the smartphone’s camera, screen and microphone to gather contextual information, processes that data locally and never sends user information to external servers. By keeping all computation on the phone, Oppo aims to deliver a privacy‑first experience while still providing powerful AI‑driven capabilities.

X-OmniClaw relies on the device’s built‑in AI accelerator, which handles vision, speech and natural language understanding without a network connection. The camera stream is used to recognize objects, read text in the environment and interpret gestures, while screen capture lets the agent understand what is currently displayed in any application. Voice input is captured through the microphone and converted to text locally, so the agent can respond to spoken commands in real time. All three modalities are fused within a single on‑device model that reasons about the user’s immediate context and executes appropriate actions.

Even without a network connection, the agent can summarize the content of a webpage, identify products the camera is pointed at, launch apps based on voice cues or translate text captured through the lens in real time. The system is designed to be extensible: developers can use the published APIs to integrate custom skills or to build entirely new applications on the same privacy‑preserving pipeline. The source code is hosted on GitHub under an Apache 2.0 license, inviting the community to inspect, modify and extend the agent.
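
The published APIs are not described here, so the following is only a minimal Python sketch of how a custom-skill registry for an agent like this might be organized. Every name in it (SkillRegistry, register, dispatch, and the two example skills) is hypothetical and is not taken from the X-OmniClaw repository.

```python
from typing import Callable, Dict

# Hypothetical skill signature: takes the user's request text, returns a reply.
Skill = Callable[[str], str]


class SkillRegistry:
    """Illustrative registry mapping trigger keywords to skill handlers."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, keyword: str) -> Callable[[Skill], Skill]:
        """Decorator that stores a handler under a lowercase keyword."""
        def decorator(func: Skill) -> Skill:
            self._skills[keyword.lower()] = func
            return func
        return decorator

    def dispatch(self, utterance: str) -> str:
        """Run the first registered skill whose keyword appears in the utterance."""
        text = utterance.lower()
        for keyword, skill in self._skills.items():
            if keyword in text:
                return skill(utterance)
        return "No matching skill."


registry = SkillRegistry()


@registry.register("open")
def open_app(utterance: str) -> str:
    # A real skill would call into the platform; this one only reports intent.
    app_name = utterance.split()[-1]
    return f"Launching {app_name}"


@registry.register("translate")
def translate(utterance: str) -> str:
    # Placeholder: no actual translation is performed in this sketch.
    return "Translation requested"
```

Under these assumptions, a call such as registry.dispatch("Please open Camera") would be routed to the open_app handler. Simple keyword matching stands in for whatever intent resolution the real agent performs.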

Oppo emphasizes that the release of X-OmniClaw is part of a broader strategy to advance on‑device artificial intelligence. By making the technology openly available, the company hopes to spur innovation in areas where data sensitivity is paramount, such as healthcare, finance and personal productivity. The agent’s architecture is deliberately modular: contributors can replace individual components, such as the vision encoder or the speech recognizer, with alternatives that better suit specific hardware or use‑case requirements.
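
As a rough illustration of that kind of modularity, the Python sketch below defines interfaces for two components and swaps one implementation for another without touching the rest of the pipeline. The interfaces, class names and stub behavior are all assumptions made for this example; they do not reflect the actual X-OmniClaw code.

```python
from dataclasses import dataclass
from typing import Protocol


class VisionEncoder(Protocol):
    """Hypothetical interface for anything that turns a frame into a description."""
    def describe(self, frame: bytes) -> str: ...


class SpeechRecognizer(Protocol):
    """Hypothetical interface for anything that turns audio into text."""
    def transcribe(self, audio: bytes) -> str: ...


class StubVisionEncoder:
    def describe(self, frame: bytes) -> str:
        # Stand-in for a real model: only reports the frame size.
        return f"image of {len(frame)} bytes"


class StubSpeechRecognizer:
    def transcribe(self, audio: bytes) -> str:
        # Stand-in for a real model: treats the bytes as UTF-8 text.
        return audio.decode("utf-8")


class UppercaseSpeechRecognizer:
    """Alternative recognizer, standing in for a hardware-specific backend."""
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8").upper()


@dataclass
class AgentPipeline:
    vision: VisionEncoder
    speech: SpeechRecognizer

    def build_context(self, frame: bytes, audio: bytes) -> str:
        """Combine the outputs of both components into one context string."""
        seen = self.vision.describe(frame)
        heard = self.speech.transcribe(audio)
        return f"seen: {seen}; heard: {heard}"


default = AgentPipeline(StubVisionEncoder(), StubSpeechRecognizer())
swapped = AgentPipeline(StubVisionEncoder(), UppercaseSpeechRecognizer())
```

Because AgentPipeline depends only on the two interfaces, replacing the recognizer changes the transcription step while the vision step and the pipeline code stay the same.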

Early demonstrations reportedly show latency well under a second for typical interactions, a result attributed to efficient use of the device’s neural processing unit and to careful optimizations in the model pipeline. Power consumption is said to remain within the bounds of normal smartphone usage, so the always‑on listening and sensing features should not noticeably affect battery life.

In summary, Oppo’s open‑sourcing of X-OmniClaw delivers a fully on‑device, multimodal AI agent that uses the phone’s camera, screen and microphone without user data ever leaving the device. The release offers developers a privacy‑centric foundation for building intelligent mobile applications while giving users confidence that their personal data stays under their control.
