MiniMe Technical Whitepaper

A privacy-first architecture for extracting semantic knowledge graphs from active window telemetry using on-device Large Language Models.

Version 1.0, February 2026

Abstract

Knowledge workers generate thousands of fragmented data points daily across dozens of applications. Existing time-tracking solutions rely on manual entry or invasive cloud-based surveillance. MiniMe proposes an architecture that captures telemetry locally, uses a quantized Llama-3 model running on-device for named entity recognition (NER), and builds a semantic Personal Knowledge Graph (PKG). This approach keeps all raw data under the user's control while providing enterprise-grade analytics.

1. System Architecture

The MiniMe system is composed of three primary decoupled layers:

  • The Sensor Daemon: A lightweight Rust binary natively integrated with macOS (Accessibility API), Windows (User32), and Linux (X11/Wayland). It polls active window state at 1Hz, buffering changes to avoid disk thrashing.
  • The Local Backend: A Python/FastAPI service managing a local SQLite database. It serves as the primary router for incoming telemetry and outgoing UI queries.
  • The AI Engine: An integration with Ollama to run 8B parameter models. The engine executes batch inference jobs during idle periods to enrich raw telemetry with semantic metadata.
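The daemon's "buffer changes to avoid disk thrashing" behavior can be sketched as follows. The production daemon is Rust; this Python sketch uses illustrative class and field names (`WindowState`, `ChangeBuffer`) that are not part of the MiniMe codebase, and only shows the change-detection logic, not the OS-specific polling:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WindowState:
    """Snapshot of the active window at one 1Hz poll tick."""
    app_name: str
    window_title: str

class ChangeBuffer:
    """Accumulates polled states, recording an event only when the
    active window actually changes, so a steady 1Hz poll does not
    translate into one disk write per second."""
    def __init__(self):
        self.last = None
        self.events = []  # buffered (timestamp, state) pairs

    def poll(self, state: WindowState, now: float) -> None:
        if state != self.last:
            self.events.append((now, state))
            self.last = state

    def flush(self):
        """Drain the buffer, e.g. before a single batched DB write."""
        batch, self.events = self.events, []
        return batch

buf = ChangeBuffer()
buf.poll(WindowState("VS Code", "main.rs"), 0.0)
buf.poll(WindowState("VS Code", "main.rs"), 1.0)  # unchanged: not buffered
buf.poll(WindowState("Firefox", "Docs"), 2.0)
assert len(buf.flush()) == 2
```

Flushing in batches also pairs naturally with SQLite's WAL mode, where grouped writes are cheap.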

2. The Knowledge Graph (PKG)

Unlike relational time-logs, MiniMe structures data as an RDF-style graph. Nodes represent Software Applications, Projects, Skills, and People. Edges represent temporal interactions (e.g., [User] USED [VS Code] FOR [2 hours] ON [minime-backend]). This enables complex Cypher-like queries directly against the user's history.
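A minimal sketch of this node/edge structure in Python. The type names and predicates (`Node`, `Edge`, `USED`, `WORKED_ON`) are illustrative, not the actual PKG schema, and the "query" is plain filtering standing in for a Cypher-like engine:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    kind: str   # e.g. "Application", "Project", "Skill", "Person"
    name: str

@dataclass(frozen=True)
class Edge:
    subject: Node
    predicate: str         # e.g. "USED"
    obj: Node
    duration_hours: float  # temporal weight on the interaction

user = Node("Person", "User")
vscode = Node("Application", "VS Code")
project = Node("Project", "minime-backend")

edges = [
    Edge(user, "USED", vscode, 2.0),
    Edge(user, "WORKED_ON", project, 2.0),
]

# A Cypher-like query ("which applications did the user use?")
# reduces to filtering edges by subject and predicate:
apps = [e.obj.name for e in edges
        if e.subject == user and e.predicate == "USED"]
assert apps == ["VS Code"]
```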

3. Privacy & Security Model

The foundational invariant of MiniMe is Zero Trust Cloud. The local node is the primary source of truth.

When the user opts into Cloud Sync (for cross-device experiences or Enterprise team aggregation), the local node encrypts the PKG using AES-256-GCM derived from a user-held master password. The server stores only ciphertext blobs. The cloud backend cannot construct analytics; it acts strictly as a dumb storage layer (E2EE).
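The key-derivation step described above can be sketched with the standard library. The PBKDF2 iteration count is illustrative (the whitepaper does not specify one), and the AES-256-GCM encryption itself would use a vetted AEAD implementation (e.g. the `cryptography` package's `AESGCM`) rather than hand-rolled code:

```python
import hashlib
import os

def derive_master_key(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key from the user-held master password via
    PBKDF2-HMAC-SHA256. Iteration count here is illustrative."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 600_000, dklen=32
    )

salt = os.urandom(16)  # random per-vault salt, stored with the blob
key = derive_master_key("correct horse battery staple", salt)
assert len(key) == 32  # AES-256 key size

# The key would then feed an AES-256-GCM AEAD before upload; the
# server only ever stores salt + nonce + ciphertext, so it cannot
# decrypt or analyze the PKG.
```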

4. Performance Benchmarks

In our tests on an Apple M3 Mac with 16GB RAM:

  • Sensor Daemon CPU usage: `< 0.5%`
  • SQLite write latency: 2ms (WAL journal mode with synchronous=NORMAL)
  • Local LLM Inference (Batch of 50 activities): 4.2 seconds
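The SQLite settings behind the write-latency figure can be sketched with the standard library `sqlite3` module. The `activity` table schema here is illustrative, not the actual MiniMe schema:

```python
import sqlite3

def open_telemetry_db(path: str) -> sqlite3.Connection:
    """Open the local store with the settings used in the benchmark:
    WAL journaling plus synchronous=NORMAL trades a little durability
    on power loss for much lower write latency."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL;")
    conn.execute("PRAGMA synchronous=NORMAL;")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        "  ts REAL NOT NULL,"
        "  app TEXT NOT NULL,"
        "  title TEXT"
        ")"
    )
    return conn

conn = open_telemetry_db(":memory:")
conn.execute("INSERT INTO activity VALUES (0.0, 'VS Code', 'main.rs')")
assert conn.execute("SELECT COUNT(*) FROM activity").fetchone()[0] == 1
```

With WAL mode, readers (the UI query path) are not blocked by the daemon's batched writes, which is why a single connection pair can serve both roles.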