Overview
Melian can extract text from images using the Google Cloud Vision API. This powers the Remarkable tablet integration (converting handwritten notebook pages to text), document scanning, and any other workflow that starts with a photo of text.
Backend
| Backend | Use case |
| --- | --- |
| Google Cloud Vision | Handwriting, low-quality scans, complex layouts, production accuracy |
Tools
| Tool | Parameters | Description |
| --- | --- | --- |
| ocr_image | url?: string, file_path?: string, base64?: string | Extract text from an image by URL, local file path, or base64 data |
| remarkable_ocr | document: string, page?: number | OCR a specific page of a Remarkable document |
| remarkable_view | document: string, page?: number | Return the raw image of a Remarkable document page without OCR |
ocr_image Parameter Details
| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| url | string | one of url/file_path/base64 | Publicly accessible image URL |
| file_path | string | one of url/file_path/base64 | Absolute path to a local image file |
| base64 | string | one of url/file_path/base64 | Base64-encoded image data |
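The "one of url/file_path/base64" rule can be checked on the caller side before the tool is invoked. The sketch below is illustrative only: `build_ocr_image_args` is a hypothetical helper, not part of Melian, and it assumes that exactly one source should be supplied per call and that the `base64` parameter takes standard base64 text without a data-URI prefix.

```python
import base64
from typing import Optional


def build_ocr_image_args(
    url: Optional[str] = None,
    file_path: Optional[str] = None,
    image_bytes: Optional[bytes] = None,
) -> dict:
    """Build an argument dict for ocr_image (hypothetical helper).

    Assumption: exactly one image source is expected per call.
    """
    encoded = None
    if image_bytes is not None:
        # Raw bytes are base64-encoded to fill the `base64` parameter.
        encoded = base64.b64encode(image_bytes).decode("ascii")

    candidates = {"url": url, "file_path": file_path, "base64": encoded}
    provided = {name: value for name, value in candidates.items() if value is not None}

    if len(provided) != 1:
        raise ValueError("provide exactly one of url, file_path, or base64")
    return provided
```

For example, `build_ocr_image_args(url="https://example.com/scan.png")` yields a dict with only the `url` key, while passing both a URL and a file path raises `ValueError` before any request is made.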
remarkable_ocr / remarkable_view Parameter Details
| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| document | string | yes | Document name or ID as synced via Remarkable Connect |
| page | number | no | 1-indexed page number (default 1) |
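Because `page` is 1-indexed and defaults to 1, off-by-one mistakes are easy to make when the caller tracks pages with a zero-based index. The following sketch shows one way to build arguments for `remarkable_ocr` or `remarkable_view`; both function names are hypothetical, and the validation rules (non-empty document, integer page of at least 1) are assumptions rather than documented Melian behavior.

```python
def build_remarkable_args(document: str, page: int = 1) -> dict:
    """Build arguments for remarkable_ocr / remarkable_view (hypothetical helper)."""
    if not document:
        raise ValueError("document must be a non-empty name or ID")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(page, bool) or not isinstance(page, int) or page < 1:
        raise ValueError("page is 1-indexed and must be an integer >= 1")
    return {"document": document, "page": page}


def page_from_index(index: int) -> int:
    """Convert a zero-based index (as used by Python lists) to a 1-indexed page."""
    return index + 1
```

Calling `build_remarkable_args("Meeting Notes")` relies on the default and targets page 1; `build_remarkable_args("Meeting Notes", page_from_index(2))` targets the third page.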
Remarkable Integration
Remarkable documents are synced via Remarkable Connect. The vision tools read pages from the local sync directory and pass them to Google Cloud Vision for OCR. This enables workflows such as writing notes on the tablet, asking Melian to read them, and saving the result to the knowledge base.
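The write, read, save workflow can be sketched as a small pipeline. This is a rough illustration under stated assumptions: `ocr_page` stands in for a call to `remarkable_ocr`, `save_note` stands in for whatever knowledge-base tool is available (none is named here), and both are passed in as plain callables because the actual invocation mechanism is not described in this section.

```python
from typing import Callable, List


def notes_to_knowledge_base(
    document: str,
    page_count: int,
    ocr_page: Callable[[str, int], str],    # stand-in for remarkable_ocr
    save_note: Callable[[str, str], None],  # hypothetical knowledge-base writer
) -> str:
    """OCR pages 1..page_count of a document and save the combined text."""
    texts: List[str] = []
    for page in range(1, page_count + 1):  # pages are 1-indexed
        text = ocr_page(document, page).strip()
        if text:  # skip blank pages
            texts.append(text)
    combined = "\n\n".join(texts)
    save_note(document, combined)
    return combined
```

Injecting the two callables keeps the sketch independent of how the tools are actually exposed, and makes it easy to exercise with fakes before wiring it to real calls.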