VoiceCore: The Rust Core Library for Voice

  • Purpose: VoiceCore provides the foundational functionality for Voice, a note-taking application with hierarchical tags and peer-to-peer synchronization.
  • Architecture: Pure Rust library designed for integration with Python bindings (desktop/server) and native Android applications.
  • No Dependencies on System Libraries: Uses bundled SQLite and pure-Rust TLS, eliminating the need for OpenSSL or other native dependencies.

Features

  • Note Management: Full CRUD operations with soft-delete semantics for sync compatibility.
  • Hierarchical Tags: Tree-structured tag system with parent-child relationships.
  • Full-Text Search: Search notes by content and/or tag filters, with support for hierarchical tag paths.
  • Peer-to-Peer Sync: Bidirectional synchronization protocol across multiple devices.
  • Conflict Resolution: Automatic conflict detection with manual resolution strategies.
  • Configuration Management: Device identity, peer management, and theme settings.
  • TLS/TOFU Security: Self-signed certificate generation with Trust-On-First-Use verification.
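The Trust-On-First-Use idea can be illustrated with a small sketch. It is not VoiceCore's tls module: `TofuStore` and `TofuOutcome` are hypothetical names, and the certificate fingerprint is assumed to be an already-computed string (for example a hex digest).

```rust
use std::collections::HashMap;

/// Outcome of checking a peer certificate fingerprint against the trust store.
#[derive(Debug, PartialEq, Eq)]
pub enum TofuOutcome {
    /// First contact with this peer: the fingerprint was pinned.
    TrustedFirstUse,
    /// The fingerprint matches the one pinned earlier.
    Matches,
    /// The fingerprint differs from the pinned one; the caller should refuse the connection.
    Mismatch,
}

/// Hypothetical in-memory trust store keyed by peer device ID.
#[derive(Default)]
pub struct TofuStore {
    pinned: HashMap<String, String>,
}

impl TofuStore {
    /// Pin the fingerprint on first use, otherwise compare it with the pinned value.
    pub fn verify(&mut self, peer_id: &str, fingerprint: &str) -> TofuOutcome {
        match self.pinned.get(peer_id) {
            None => {
                self.pinned
                    .insert(peer_id.to_string(), fingerprint.to_string());
                TofuOutcome::TrustedFirstUse
            }
            Some(known) if known.as_str() == fingerprint => TofuOutcome::Matches,
            Some(_) => TofuOutcome::Mismatch,
        }
    }
}
```

A real store would persist the pinned fingerprints so that a changed certificate is still detected after a restart.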

Architecture

src/
├── lib.rs              # Library entry point and public API re-exports
├── models.rs           # Core data structures (Note, Tag, NoteTag)
├── database.rs         # SQLite data access layer
├── error.rs            # Error types and handling
├── validation.rs       # Input validation utilities
├── config.rs           # Configuration management
├── sync_client.rs      # Peer-to-peer sync client
├── sync_server.rs      # Sync server (Axum-based)
├── conflicts.rs        # Conflict resolution engine
├── merge.rs            # Text merging algorithms
├── search.rs           # Search and filtering
└── tls.rs              # TLS/certificate management

Module Overview

Module       Purpose
models       Core data structures: Note, Tag, NoteTag with UUID7 identifiers
database     SQLite persistence with comprehensive CRUD and query operations
error        VoiceError enum and ValidationError for detailed error handling
validation   UUID, datetime, tag path, and content validation utilities
config       JSON-based configuration with device identity and peer management
sync_client  Async HTTP client for pulling/pushing changes to peers
sync_server  Axum-based REST server for receiving sync requests
conflicts    Detection and resolution of content, delete, and rename conflicts
merge        Line-by-line diff and 3-way merge algorithms
search       Parser for combined tag and text search queries
tls          Self-signed certificate generation and TOFU verification

Requirements

  • Rust 1.70 or higher (2021 edition)
  • No external system dependencies (SQLite is bundled)

Building

As a Standalone Library

cargo build --release

Running Tests

cargo test

Building Documentation

cargo doc --open

Usage

Basic Note Operations

use voicecore::{Database, Config};

// Initialize with default config directory (~/.config/voice)
let config = Config::new(None)?;
let mut db = Database::new(config.database_file())?;

// Create a note
let note_id = db.create_note("My first note")?;

// Update note content
db.update_note(&note_id, "Updated content")?;

// Get a note
let note = db.get_note(&note_id)?;
println!("Note content: {}", note.content);

// List all notes
let notes = db.get_notes()?;

// Delete a note (soft-delete for sync)
db.delete_note(&note_id)?;

Tag Operations

// Create a root tag
let work_id = db.create_tag("Work", None)?;

// Create a child tag
let projects_id = db.create_tag("Projects", Some(&work_id))?;

// Associate a tag with a note
db.add_note_tag(&note_id, &work_id)?;

// Get all tags
let tags = db.get_tags()?;

// Get tags in hierarchical order
let tree = db.get_tags_hierarchical()?;

Search

// Search by text content
let results = db.search_notes(Some("meeting notes"), None)?;

// Search by tag
let results = db.search_notes(None, Some(vec!["Work"]))?;

// Combined search (text AND tag)
let results = db.search_notes(Some("quarterly"), Some(vec!["Work", "Reports"]))?;

// Search with hierarchical tag path
let results = db.search_notes(None, Some(vec!["Europe/France/Paris"]))?;
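How a hierarchical path such as Europe/France/Paris might map onto the parent-child tag tree can be sketched as below. This is an illustration only, not the library's search code: `TagRow` and `resolve_tag_path` are hypothetical, integer IDs stand in for UUID7 values, and the sketch assumes that a path starts at a root tag and that names match exactly.

```rust
/// Simplified stand-in for a tag row; the real Tag uses UUID7 identifiers.
pub struct TagRow {
    pub id: u32,
    pub name: &'static str,
    pub parent_id: Option<u32>,
}

/// Walk a slash-separated path from the root, matching one tag name per level.
/// Returns the ID of the last tag on the path, or None if any segment is missing.
pub fn resolve_tag_path(tags: &[TagRow], path: &str) -> Option<u32> {
    let mut parent: Option<u32> = None;
    for segment in path.split('/') {
        // The next tag must have the expected name and hang off the current parent.
        let tag = tags
            .iter()
            .find(|t| t.parent_id == parent && t.name == segment)?;
        parent = Some(tag.id);
    }
    parent
}
```

With tags Europe (root), France (child of Europe), and Paris (child of France), the path Europe/France/Paris resolves to the Paris tag, while Europe/Paris resolves to nothing because Paris is not a direct child of Europe.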

Synchronization

use voicecore::{Config, Database, SyncClient, SyncServer};
use std::sync::{Arc, Mutex};

// Initialize sync client (db_path is the path to the SQLite database file)
let db = Arc::new(Mutex::new(Database::new(&db_path)?));
let config = Arc::new(Mutex::new(Config::new(None)?));
let sync_client = SyncClient::new(db.clone(), config.clone())?;

// Sync with a peer
let result = sync_client.sync_with_peer("peer_device_id").await?;
println!("Pulled: {}, Pushed: {}, Conflicts: {}",
    result.pulled, result.pushed, result.conflicts);

// Start sync server
let server = SyncServer::new(db, config);
server.start("0.0.0.0", 8384).await?;

Conflict Resolution

use voicecore::ResolutionChoice;

// List pending conflicts
let conflicts = db.get_note_content_conflicts()?;

for conflict in conflicts {
    println!("Conflict on note {}: local vs remote", conflict.note_id);
    println!("Local: {}", conflict.local_content);
    println!("Remote: {}", conflict.remote_content);

    // Resolve each conflict exactly once. Here: keep the local version
    db.resolve_note_content_conflict(&conflict.id, ResolutionChoice::KeepLocal)?;

    // Alternatively, keep the remote version:
    // db.resolve_note_content_conflict(&conflict.id, ResolutionChoice::KeepRemote)?;

    // Or merge (manual editing required):
    // db.resolve_note_content_conflict(&conflict.id, ResolutionChoice::Merge)?;
}

Configuration

// Load or create config
let mut config = Config::new(Some("/path/to/config/dir"))?;

// Get device identity
println!("Device ID: {}", config.device_id());
println!("Device Name: {}", config.device_name());

// Add a sync peer
config.add_peer(
    "a1b2c3d4e5f67890",  // peer device ID
    "HomeServer",         // peer name
    "https://sync.example.com"  // peer URL
)?;

// List configured peers
let peers = config.get_peers();

// Remove a peer
config.remove_peer("a1b2c3d4e5f67890")?;

API Reference

Core Types

Note

pub struct Note {
    pub id: Uuid,                           // UUID7 identifier
    pub created_at: DateTime<Utc>,          // Creation timestamp
    pub content: String,                    // Note content (max 100KB)
    pub device_id: Uuid,                    // Creating device ID
    pub modified_at: Option<DateTime<Utc>>, // Last modification
    pub deleted_at: Option<DateTime<Utc>>,  // Soft-delete timestamp
}

Tag

pub struct Tag {
    pub id: Uuid,                           // UUID7 identifier
    pub name: String,                       // Tag name (max 255 chars)
    pub device_id: Uuid,                    // Creating device ID
    pub parent_id: Option<Uuid>,            // Parent tag for hierarchy
    pub created_at: Option<DateTime<Utc>>,  // Creation timestamp
    pub modified_at: Option<DateTime<Utc>>, // Last modification
}

NoteTag

pub struct NoteTag {
    pub note_id: Uuid,                      // Associated note
    pub tag_id: Uuid,                       // Associated tag
    pub created_at: DateTime<Utc>,          // Association creation
    pub device_id: Uuid,                    // Creating device ID
    pub modified_at: Option<DateTime<Utc>>, // Last modification
    pub deleted_at: Option<DateTime<Utc>>,  // Soft-delete timestamp
}

Error Types

pub enum VoiceError {
    Validation(ValidationError),  // Input validation failures
    Database(String),             // SQLite errors
    Sync(String),                 // Synchronization errors
    Network(String),              // HTTP/connection errors
    Tls(String),                  // Certificate errors
    Config(String),               // Configuration errors
    NotFound(String),             // Entity not found
    Conflict(String),             // Sync conflicts
}

pub enum ValidationError {
    InvalidUuid(String),
    InvalidDatetime(String),
    InvalidTagName(String),
    InvalidTagPath(String),
    ContentTooLong(usize),
    EmptyContent,
    // ... additional variants
}

Sync Types

pub struct SyncResult {
    pub success: bool,
    pub pulled: i64,      // Changes received from peer
    pub pushed: i64,      // Changes sent to peer
    pub conflicts: i64,   // Conflicts detected
    pub errors: Vec<String>,
}

pub enum ResolutionChoice {
    KeepLocal,   // Use local version
    KeepRemote,  // Use remote version
    Merge,       // Manual merge (with conflict markers)
    KeepBoth,    // For delete conflicts: restore deleted note
}

Database Schema

VoiceCore uses SQLite and stores UUID7 identifiers as BLOB primary keys.

Core Tables

Table      Purpose
notes      Note content with timestamps and soft-delete
tags       Hierarchical tag definitions
note_tags  Many-to-many note-tag associations

Sync Infrastructure

Table                   Purpose
sync_peers              Configured peer devices and last sync times
sync_failures           Failed sync operations for retry
conflicts_note_content  Content conflicts awaiting resolution
conflicts_note_delete   Delete vs. edit conflicts
conflicts_tag_rename    Tag rename conflicts

Schema Versioning

The database includes a schema_version table for migrations. Current schema version: 1.

Sync Protocol

This section documents the sync protocol for implementing new clients (e.g., mobile apps, web clients).

Protocol Overview

VoiceCore uses a bidirectional sync protocol where any device can act as both client and server. The protocol supports:

  • Incremental sync (only changes since last sync)
  • Full sync (complete dataset transfer for initial sync or recovery)
  • Audio file transfer (binary content via separate endpoints)
  • Conflict detection and resolution

Endpoints

Method  Path                                       Description
POST    /sync/handshake                            Device discovery and identity exchange
GET     /sync/changes?since=<timestamp>&limit=<n>  Pull changes since timestamp
POST    /sync/apply                                Apply remote changes to local database
GET     /sync/full                                 Full dataset for initial sync
GET     /sync/status                               Health check and server info
GET     /sync/audio/<audio_id>                     Download audio file binary content
POST    /sync/audio/<audio_id>                     Upload audio file binary content

Endpoint Details

POST /sync/handshake

Exchange device identities and determine last sync timestamp.

Request:

{
    "device_id": "018d1234abcd5678...",
    "device_name": "My Android Phone",
    "protocol_version": "1.0"
}

Response:

{
    "device_id": "018d5678efab9012...",
    "device_name": "Home Server",
    "protocol_version": "1.0",
    "last_sync_timestamp": "2024-01-15 10:30:00",
    "server_timestamp": "2024-01-15 12:00:00",
    "supports_audiofiles": true
}

GET /sync/changes

Pull changes since a given timestamp.

Query Parameters:

  • since (optional): Timestamp in YYYY-MM-DD HH:MM:SS format (space-separated, not the ISO 8601 "T" form). If omitted, returns all changes.
  • limit (optional): Maximum number of changes to return. Default: 1000.

Response:

{
    "changes": [ /* array of SyncChange objects */ ],
    "from_timestamp": "2024-01-15 10:30:00",
    "to_timestamp": "2024-01-15 12:00:00",
    "device_id": "018d5678efab9012...",
    "device_name": "Home Server",
    "is_complete": true
}

If is_complete is false, more changes are available. Call the endpoint again with an updated since parameter until is_complete is true.
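A paging loop over this endpoint could look like the sketch below. It rests on one assumption that the protocol description does not spell out: the to_timestamp of each response is used as the since value of the next request. `ChangesPage`, `pull_all`, and the `fetch` callback are hypothetical stand-ins for the real HTTP call and response type.

```rust
/// Simplified stand-in for one /sync/changes response.
pub struct ChangesPage {
    pub changes: Vec<String>,
    pub to_timestamp: String,
    pub is_complete: bool,
}

/// Pull pages until the peer reports `is_complete`.
/// `fetch` is a hypothetical callback that performs GET /sync/changes?since=...
/// Returns all collected changes and the last `to_timestamp` seen.
pub fn pull_all<F>(mut since: Option<String>, mut fetch: F) -> (Vec<String>, Option<String>)
where
    F: FnMut(Option<&str>) -> ChangesPage,
{
    let mut all = Vec::new();
    loop {
        let page = fetch(since.as_deref());
        all.extend(page.changes);
        // Assumption: the next request continues from this page's to_timestamp.
        since = Some(page.to_timestamp);
        if page.is_complete {
            return (all, since);
        }
    }
}
```

A production client would also cap the number of iterations so that a misbehaving peer that never sets is_complete cannot keep the loop running forever.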

POST /sync/apply

Apply changes from another device.

Request:

{
    "device_id": "018d1234abcd5678...",
    "device_name": "My Android Phone",
    "changes": [ /* array of SyncChange objects */ ]
}

Response:

{
    "applied": 15,
    "conflicts": 2,
    "errors": ["Error applying note abc123: validation failed"]
}

GET /sync/full

Get complete dataset for initial sync.

Response: Same format as /sync/changes but includes all data regardless of timestamps.

GET /sync/status

Health check endpoint.

Response:

{
    "device_id": "018d5678efab9012...",
    "device_name": "Home Server",
    "protocol_version": "1.0",
    "status": "ok",
    "supports_audiofiles": true
}

Entity Types

The sync protocol supports these entity types:

Entity Type      Description                        Dependencies
note             Note content and metadata          None
tag              Tag definitions with hierarchy     None (parent_id is self-referential)
audio_file       Audio file metadata (not content)  None
note_tag         Note-to-tag associations           Requires note, tag
note_attachment  Note-to-attachment associations    Requires note, audio_file
transcription    Audio transcription text           Requires audio_file

Dependency Order: When applying changes, process entities in dependency order:

  1. First: note, tag, audio_file (no dependencies)
  2. Then: note_tag, note_attachment, transcription (depend on entities from step 1)
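One way to enforce this order is to assign each entity type a rank and stable-sort the incoming changes by it. The sketch below is a minimal illustration with hypothetical names (`dependency_rank`, `sort_for_apply`), and changes are reduced to (entity_type, entity_id) pairs.

```rust
/// Rank of an entity type in the apply order; lower ranks are applied first.
/// Returns None for entity types this sketch does not know.
pub fn dependency_rank(entity_type: &str) -> Option<u8> {
    match entity_type {
        "note" | "tag" | "audio_file" => Some(0),
        "note_tag" | "note_attachment" | "transcription" => Some(1),
        _ => None,
    }
}

/// Stable-sort (entity_type, entity_id) pairs so that dependencies come first.
/// Unknown entity types are placed last; the relative order within a rank is kept.
pub fn sort_for_apply(changes: &mut [(String, String)]) {
    changes.sort_by_key(|(entity_type, _)| dependency_rank(entity_type).unwrap_or(u8::MAX));
}
```

The sketch does not order child tags after their parents. Because parent_id is self-referential, a client may need an extra pass for that.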

Change Format

{
    "entity_type": "note",
    "entity_id": "018d1234abcd5678901234567890abcd",
    "operation": "create",
    "data": {
        "content": "Meeting notes from today...",
        "created_at": "2024-01-15 10:30:00",
        "modified_at": "2024-01-15 10:30:00",
        "device_id": "018d5678efab90123456789012345678"
    },
    "timestamp": "2024-01-15 10:30:00",
    "device_id": "018d5678efab90123456789012345678"
}

Fields:

  • entity_type: One of the entity types listed above
  • entity_id: UUID7 hex string (32 characters, no hyphens)
  • operation: create, update, or delete
  • data: Entity-specific data (see below)
  • timestamp: When the change occurred (server's modified_at)
  • device_id: Device that made the change

Entity Data Formats

Note

{
    "content": "Note text content",
    "created_at": "2024-01-15 10:30:00",
    "modified_at": "2024-01-15 10:35:00",
    "deleted_at": null,
    "device_id": "018d..."
}

Tag

{
    "name": "Work",
    "parent_id": null,
    "created_at": "2024-01-15 10:30:00",
    "modified_at": "2024-01-15 10:30:00",
    "deleted_at": null,
    "device_id": "018d..."
}

Note Tag (Association)

{
    "note_id": "018d...",
    "tag_id": "018d...",
    "created_at": "2024-01-15 10:30:00",
    "modified_at": "2024-01-15 10:30:00",
    "deleted_at": null,
    "device_id": "018d..."
}

Audio File

{
    "filename": "recording_2024-01-15.m4a",
    "file_path": "/path/to/audio/recording.m4a",
    "mime_type": "audio/mp4",
    "file_size": 1234567,
    "duration_ms": 60000,
    "created_at": "2024-01-15 10:30:00",
    "modified_at": "2024-01-15 10:30:00",
    "deleted_at": null,
    "device_id": "018d..."
}

Note: The file_path is the original path on the creating device. Clients should download the actual binary via /sync/audio/<audio_id>.

Note Attachment

{
    "note_id": "018d...",
    "attachment_id": "018d...",
    "attachment_type": "audio_file",
    "created_at": "2024-01-15 10:30:00",
    "modified_at": "2024-01-15 10:30:00",
    "deleted_at": null,
    "device_id": "018d..."
}

Transcription

{
    "audio_file_id": "018d...",
    "language": "en",
    "text": "Transcribed text content...",
    "provider": "whisper",
    "model": "large-v3",
    "segments": "[{\"start\": 0.0, \"end\": 2.5, \"text\": \"Hello\"}]",
    "state": "original",
    "created_at": "2024-01-15 10:30:00",
    "modified_at": "2024-01-15 10:30:00",
    "device_id": "018d..."
}

Audio File Transfer

Audio files are transferred separately from metadata:

Download: GET /sync/audio/<audio_id>

  • Returns raw binary audio content
  • Content-Type header indicates MIME type

Upload: POST /sync/audio/<audio_id>

  • Send raw binary audio content in request body
  • Set Content-Type header appropriately

Sync Flow

Initial Sync (First Connection)

  1. POST /sync/handshake - Exchange device identities
  2. GET /sync/full - Pull complete dataset from peer
  3. Apply all changes locally (respecting dependency order)
  4. For each audio_file, GET /sync/audio/<id> to download content
  5. POST /sync/apply - Push local changes to peer
  6. For each local audio_file, POST /sync/audio/<id> to upload content

Incremental Sync

  1. POST /sync/handshake - Exchange identities, get last_sync_timestamp
  2. GET /sync/changes?since=<last_sync_timestamp> - Pull changes
  3. Apply changes locally
  4. Download any new audio files
  5. POST /sync/apply - Push local changes since last sync
  6. Upload any new local audio files
  7. Store new server_timestamp as last_sync_timestamp for next sync

Conflict Handling

When both devices modify the same entity between syncs, a conflict is created:

  • Note content conflict: Both edited the same note
  • Note delete conflict: One edited, one deleted
  • Tag rename conflict: Both renamed the same tag

Conflicts are stored in dedicated tables and must be resolved manually. The apply endpoint returns the conflict count.
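The three conflict kinds can be expressed as a small classification function. This is a sketch under assumptions, not the conflicts module itself: `SideChange`, `ConflictKind`, and `classify` are hypothetical, and an "edit" to a tag is taken to mean a rename.

```rust
/// The conflict kinds listed above.
#[derive(Debug, PartialEq, Eq)]
pub enum ConflictKind {
    NoteContent,
    NoteDelete,
    TagRename,
}

/// What one side did to an entity since the last sync.
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum SideChange {
    Unchanged,
    Edited,
    Deleted,
}

/// Decide whether concurrent changes on both sides amount to a conflict.
/// Returns None when at most one side changed, or when the combination is not
/// one of the three kinds covered by this sketch.
pub fn classify(entity_type: &str, local: SideChange, remote: SideChange) -> Option<ConflictKind> {
    use SideChange::*;
    match (entity_type, local, remote) {
        ("note", Edited, Edited) => Some(ConflictKind::NoteContent),
        ("note", Edited, Deleted) | ("note", Deleted, Edited) => Some(ConflictKind::NoteDelete),
        ("tag", Edited, Edited) => Some(ConflictKind::TagRename),
        _ => None,
    }
}
```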

Timestamp Format

CRITICAL: All timestamps MUST be in format YYYY-MM-DD HH:MM:SS with zero-padded values.

  • Correct: 2024-01-05 09:30:00
  • Wrong: 2024-1-5 9:30:00 (not zero-padded)
  • Wrong: 2024-01-05T09:30:00 (ISO 8601 "T" separator instead of a space)

Timestamps are compared as strings for Last-Write-Wins logic. Non-padded dates break lexicographic ordering.
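A client can check the format before sending or storing a timestamp. The function below is a minimal standard-library sketch (`is_valid_timestamp` is a hypothetical name, not VoiceCore's validation API). It checks the shape and coarse field ranges only, not calendar validity.

```rust
/// Check that `s` has exactly the shape YYYY-MM-DD HH:MM:SS with zero-padded fields.
/// Only digit positions and coarse field ranges are checked, not calendar validity
/// (for example, 2024-02-31 is accepted).
pub fn is_valid_timestamp(s: &str) -> bool {
    let b = s.as_bytes();
    if b.len() != 19 {
        return false;
    }
    // Separators must sit at fixed positions; every other byte must be a digit.
    let shape_ok = b.iter().enumerate().all(|(i, &c)| match i {
        4 | 7 => c == b'-',
        10 => c == b' ',
        13 | 16 => c == b':',
        _ => c.is_ascii_digit(),
    });
    if !shape_ok {
        return false;
    }
    // Slicing by byte index is safe here: every byte is ASCII at this point.
    let num = |range: std::ops::Range<usize>| s[range].parse::<u32>().unwrap_or(u32::MAX);
    (1..=12).contains(&num(5..7))
        && (1..=31).contains(&num(8..10))
        && num(11..13) <= 23
        && num(14..16) <= 59
        && num(17..19) <= 59
}
```

The ordering problem is easy to reproduce: as strings, "2024-1-15 09:30:00" compares greater than "2024-01-20 09:30:00", although January 15 precedes January 20.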

Implementing a New Client

To implement sync in a new client:

  1. Store device identity: Generate a UUID7 for device_id, store with device_name

  2. Track sync state: Store last_sync_timestamp per peer

  3. Implement change tracking: Track local changes since last sync (by modified_at)

  4. Handle all entity types: Implement create/update/delete for all 6 entity types

  5. Respect dependency order: Apply changes in correct order

  6. Handle audio files: Download/upload binary content separately

  7. Handle conflicts: Store conflicts for user resolution

  8. Validate timestamps: Ensure all timestamps use correct format
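Step 3 (change tracking by modified_at) can be as simple as a string comparison against the stored last_sync_timestamp, given the fixed timestamp format. The sketch below uses hypothetical names (`LocalRecord`, `changed_since`) and assumes that "since" means strictly after the last sync.

```rust
/// Simplified local record: an entity ID plus its modified_at timestamp.
pub struct LocalRecord {
    pub entity_id: String,
    pub modified_at: String,
}

/// Select records modified strictly after `last_sync`.
/// With no previous sync (None), every record is selected.
/// Timestamps are compared as strings, which is valid only for the
/// zero-padded YYYY-MM-DD HH:MM:SS format.
pub fn changed_since<'a>(
    records: &'a [LocalRecord],
    last_sync: Option<&str>,
) -> Vec<&'a LocalRecord> {
    records
        .iter()
        .filter(|r| match last_sync {
            Some(ts) => r.modified_at.as_str() > ts,
            None => true,
        })
        .collect()
}
```

Whether the boundary should be inclusive depends on how the peer assigns timestamps. An inclusive comparison re-sends some changes, while a strict one can miss a change made in the same second as the last sync.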

Validation Rules

Field         Rule
UUID          32 hex chars (simple) or 36 chars with hyphens
Datetime      YYYY-MM-DD HH:MM:SS format, strict
Tag name      1-255 characters, no forward slashes
Tag path      Slash-separated tag names
Note content  1 - 102,400 bytes (100KB max)
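The rules for UUIDs, tag names, and note content can be sketched with the standard library alone. The names below (`RuleError`, `normalize_uuid`, `validate_tag_name`, `validate_content`) are hypothetical and do not mirror the validation module. Lowercasing the UUID and counting tag-name characters as Unicode scalar values are assumptions.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum RuleError {
    InvalidUuid,
    InvalidTagName,
    EmptyContent,
    ContentTooLong(usize),
}

/// Maximum note content size in bytes (100KB).
pub const MAX_CONTENT_BYTES: usize = 102_400;

/// Accept 32 hex chars, or 36 chars with hyphens at the usual positions,
/// and return the 32-char lowercase form.
pub fn normalize_uuid(s: &str) -> Result<String, RuleError> {
    let hex: String = match s.len() {
        32 => s.to_string(),
        36 => {
            let bytes = s.as_bytes();
            if ![8, 13, 18, 23].iter().all(|&i| bytes[i] == b'-') {
                return Err(RuleError::InvalidUuid);
            }
            s.chars().filter(|&c| c != '-').collect()
        }
        _ => return Err(RuleError::InvalidUuid),
    };
    if hex.len() == 32 && hex.chars().all(|c| c.is_ascii_hexdigit()) {
        Ok(hex.to_ascii_lowercase())
    } else {
        Err(RuleError::InvalidUuid)
    }
}

/// Tag names: 1-255 characters, no forward slashes.
pub fn validate_tag_name(name: &str) -> Result<(), RuleError> {
    let chars = name.chars().count();
    if chars == 0 || chars > 255 || name.contains('/') {
        Err(RuleError::InvalidTagName)
    } else {
        Ok(())
    }
}

/// Note content: 1 to 102,400 bytes.
pub fn validate_content(content: &str) -> Result<(), RuleError> {
    match content.len() {
        0 => Err(RuleError::EmptyContent),
        n if n > MAX_CONTENT_BYTES => Err(RuleError::ContentTooLong(n)),
        _ => Ok(()),
    }
}
```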

Dependencies

Production

Category       Crates              Purpose
Async          tokio               Async runtime
Database       rusqlite (bundled)  SQLite driver
Serialization  serde, serde_json   JSON encoding
IDs            uuid (v7)           UUID7 generation
HTTP           reqwest, axum       Client and server
TLS            rustls, rcgen       Pure-Rust TLS
Crypto         sha2, base64        Hashing and encoding
Errors         thiserror, anyhow   Error handling
Dates          chrono              Datetime operations
Merging        diffy, similar      Diff algorithms

Development

Crate     Purpose
tempfile  Temporary files for testing

Integration

Python Bindings

VoiceCore is designed for PyO3 integration via the voice-python crate:

rust/voice-python/
├── Cargo.toml
└── src/
    └── lib.rs  # PyO3 bindings

Build with maturin:

cd rust/voice-python
maturin develop --release

Android

VoiceCore can be compiled for Android targets:

# Add Android targets
rustup target add aarch64-linux-android armv7-linux-androideabi

# Build with cargo-ndk
cargo ndk -t arm64-v8a -t armeabi-v7a build --release

Testing

# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test
cargo test test_create_note

# Run tests for a specific module
cargo test database::tests

License

GPL version 3.0 or above

Authorship

  • Written by Dotan Cohen.
  • Extensive assistance from Anthropic Claude via Claude Code.

Related Projects

  • Voice Desktop - Python desktop application using VoiceCore
  • voice-android (planned) - Android application using VoiceCore

About

Core library for applications in the Voice family.
