Melian

Vision

Overview

Melian can extract text from images using the Google Cloud Vision API. This powers the Remarkable tablet integration (converting handwritten notebook pages to text), document scanning, and any workflow that starts with a photo of text.

Backend

| Backend | Use case |
| --- | --- |
| Google Cloud Vision | Handwriting, low-quality scans, complex layouts, production accuracy |
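
For orientation, a request to the Cloud Vision REST endpoint `images:annotate` might be assembled roughly as in the sketch below. It only builds the JSON body and sends nothing. The function name `build_annotate_request` is hypothetical, and the choice of the `DOCUMENT_TEXT_DETECTION` feature (Google's feature type for dense text and handwriting) is an assumption; this page does not say which feature Melian requests.

```python
import base64
import json


def build_annotate_request(image_bytes: bytes) -> dict:
    """Build a JSON body for POST https://vision.googleapis.com/v1/images:annotate.

    Assumption: DOCUMENT_TEXT_DETECTION is used; Melian's actual feature
    selection is not documented on this page.
    """
    return {
        "requests": [
            {
                # Inline image bytes are sent as a base64 string in "content".
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file.
    body = build_annotate_request(b"placeholder bytes, not a real image")
    print(json.dumps(body, indent=2))
```

In a successful response from this endpoint, the recognized text for the whole image is found under `responses[0].fullTextAnnotation.text`.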

Tools

| Tool | Parameters | Description |
| --- | --- | --- |
| ocr_image | url?: string, file_path?: string, base64?: string | Extract text from an image by URL, local file path, or base64 data |
| remarkable_ocr | document: string, page?: number | OCR a specific page of a Remarkable document |
| remarkable_view | document: string, page?: number | Return the raw image of a Remarkable document page without OCR |

ocr_image Parameter Details

| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| url | string | one of url/file_path/base64 | Publicly accessible image URL |
| file_path | string | one of url/file_path/base64 | Absolute path to a local image file |
| base64 | string | one of url/file_path/base64 | Base64-encoded image data |
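
The "one of url/file_path/base64" rule can be checked before a call is made. The sketch below is illustrative only: `build_ocr_image_args` and `encode_image_bytes` are hypothetical helpers, not part of Melian, and treating the rule as "exactly one" (rather than "at least one") is an assumption.

```python
import base64
import os
from typing import Optional


def encode_image_bytes(data: bytes) -> str:
    """Base64-encode raw image bytes for the `base64` parameter."""
    return base64.b64encode(data).decode("ascii")


def build_ocr_image_args(
    url: Optional[str] = None,
    file_path: Optional[str] = None,
    base64_data: Optional[str] = None,
) -> dict:
    """Return an argument dict for ocr_image with exactly one image source.

    Assumption: supplying more than one source is treated as an error here;
    the tool itself may behave differently.
    """
    supplied = {"url": url, "file_path": file_path, "base64": base64_data}
    chosen = {name: value for name, value in supplied.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError("provide exactly one of url, file_path, or base64")
    # The parameter table asks for an absolute path to a local file.
    if file_path is not None and not os.path.isabs(file_path):
        raise ValueError("file_path must be an absolute path")
    return chosen


if __name__ == "__main__":
    print(build_ocr_image_args(url="https://example.com/receipt.png"))
    print(build_ocr_image_args(base64_data=encode_image_bytes(b"placeholder")))
```

Validating on the caller's side keeps a malformed request from reaching the OCR backend at all.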

remarkable_ocr / remarkable_view Parameter Details

| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| document | string | yes | Document name or ID as synced via Remarkable Connect |
| page | number | no | 1-indexed page number (default 1) |
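
Because `page` is optional and 1-indexed, an off-by-one is easy to introduce when converting from a zero-based loop counter. The sketch below shows one way to build the arguments; `build_remarkable_args` is a hypothetical helper, and rejecting values below 1 on the caller's side is an assumption about how strict one would want to be.

```python
def build_remarkable_args(document: str, page: int = 1) -> dict:
    """Return an argument dict for remarkable_ocr or remarkable_view.

    `page` defaults to 1 and is 1-indexed, matching the parameter table.
    """
    if not document:
        raise ValueError("document is required")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(page, bool) or not isinstance(page, int) or page < 1:
        raise ValueError("page must be an integer >= 1 (pages are 1-indexed)")
    return {"document": document, "page": page}


if __name__ == "__main__":
    # "Meeting Notes" is a placeholder document name.
    for zero_based_index in range(3):
        # Convert a zero-based loop index to the 1-indexed page number.
        print(build_remarkable_args("Meeting Notes", page=zero_based_index + 1))
```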

Remarkable Integration

Remarkable documents are synced via Remarkable Connect. The vision tools read pages from the local sync directory and pass them to Google Cloud Vision for OCR. This enables workflows such as writing notes on the tablet, asking Melian to read them, and saving the result to the knowledge base.
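
The write, read, save workflow described above could be sketched as follows. Everything about the plumbing here is assumed: `call_tool` and `save_note` are hypothetical callables standing in for however Melian tools are invoked and however the knowledge base is written, and the OCR result is assumed to come back as a plain string.

```python
from typing import Callable

# Hypothetical signatures: (tool_name, arguments) -> extracted text,
# and (title, body) -> None for the knowledge-base write.
ToolCaller = Callable[[str, dict], str]
NoteSaver = Callable[[str, str], None]


def read_and_save_page(
    call_tool: ToolCaller,
    save_note: NoteSaver,
    document: str,
    page: int = 1,
) -> str:
    """OCR one Remarkable page and store the text under a descriptive title."""
    text = call_tool("remarkable_ocr", {"document": document, "page": page})
    save_note(f"{document} (page {page})", text)
    return text


if __name__ == "__main__":
    # Stand-ins so the sketch runs without Melian or network access.
    def fake_call_tool(name: str, args: dict) -> str:
        return f"[text from {name} for {args['document']} p.{args['page']}]"

    saved = {}

    def fake_save_note(title: str, body: str) -> None:
        saved[title] = body

    read_and_save_page(fake_call_tool, fake_save_note, "Meeting Notes", page=2)
    print(saved)
```

Passing the two callables in as parameters keeps the sketch independent of any particular client library and makes it easy to exercise with stand-ins.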