# Readlist Specification

## Overview

The Readlist module allows users to save web articles for later reading. It provides a "read it later" service similar to Pocket or Instapaper, with automatic content extraction and a clean reading experience.

## Data Model

### ReadLaterItem

The `ReadLaterItem` entity has the following attributes:

| Field | Type | Description |
|-------|------|-------------|
| ID | string | Unique identifier for the item (UUID) |
| URL | string | Original URL of the article |
| Title | string | Title of the article |
| Content | string | Extracted HTML content of the article |
| Description | string | Brief description or excerpt of the article |
| CreatedAt | timestamp | When the item was saved |
| UpdatedAt | timestamp | When the item was last updated |
| ReadAt | timestamp | When the item was marked as read (null if unread) |
| ArchivedAt | timestamp | When the item was archived (null if not archived) |

## Features

### Item Management

1. **Save Article**: Users can save articles by providing a URL
2. **View Article**: Users can view saved articles in a clean, reader-friendly format
3. **Mark as Read**: Users can mark articles as read
4. **Archive Article**: Users can archive articles to remove them from the main list
5. **Delete Article**: Users can delete articles permanently
6. **List Articles**: Users can view a list of all saved articles

### Content Extraction

1. **Automatic Extraction**: The system automatically extracts the main content from web pages
2. **Title Extraction**: The system extracts the title of the article
3. **Description Extraction**: The system extracts a brief description or excerpt of the article
4. **HTML Cleaning**: The system cleans the HTML to provide a distraction-free reading experience

### Filtering and Sorting

1. **Filter by Status**: Users can filter articles by read/unread status
2. **Filter by Archive**: Users can filter articles by archived/unarchived status
3. **Sort by Date**: Users can sort articles by date saved or date read

## API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/readlist | List all read later items |
| POST | /api/readlist | Save a new article |
| GET | /api/readlist/:id | Get a specific article by ID |
| PUT | /api/readlist/:id | Update a specific article |
| DELETE | /api/readlist/:id | Delete a specific article |
| PUT | /api/readlist/:id/read | Mark an article as read |
| PUT | /api/readlist/:id/unread | Mark an article as unread |
| PUT | /api/readlist/:id/archive | Archive an article |
| PUT | /api/readlist/:id/unarchive | Unarchive an article |

## Frontend Routes

| Route | Description |
|-------|-------------|
| /readlist | List of saved articles |
| /readlist/:id | View a specific article |

## Implementation Details

### Content Extraction

The system uses the `go-readability` library to extract content from web pages:

```go
func (r *ReadLaterItem) ParseURL() error {
	article, err := readability.FromURL(r.URL, 30*time.Second)
	if err != nil {
		return fmt.Errorf("failed to parse URL: %w", err)
	}

	r.Title = article.Title
	r.Content = article.Content
	r.Description = article.Excerpt

	return nil
}
```

### HTML Sanitization

The extracted HTML content is sanitized to remove potentially harmful elements and provide a consistent reading experience:

1. Remove JavaScript and other active content
2. Preserve images, links, and basic formatting
3. Apply a consistent style to the content

### Status Management

The system tracks the status of articles using nullable timestamp fields:

1. `ReadAt`: When set, indicates the article has been read
2. `ArchivedAt`: When set, indicates the article has been archived

## User Interface

### Article List

- Displays a list of saved articles with titles, descriptions, and dates
- Provides filters for read/unread and archived/unarchived status
- Includes buttons for marking as read, archiving, and deleting

### Article Viewer

- Displays the article content in a clean, reader-friendly format
- Preserves images and links from the original article
- Provides buttons for marking as read, archiving, and returning to the list

### Save Form

- Input field for the URL to save
- Automatic extraction of content after submission
- Preview of the extracted content before saving

## Shiori Import

The Readlist module now supports importing bookmarks from a Shiori instance. This feature allows users to migrate or synchronize their bookmarks by connecting to a Shiori service using their credentials.

### API Endpoint

- POST /api/readlist/import/shiori: Accepts a JSON payload containing `url`, `username`, and `password`. The backend authenticates with the Shiori instance, fetches bookmarks, and creates corresponding read later items.

### Frontend Integration

- A form in the readlist UI accepts Shiori credentials. The readlist store includes an `importFromShiori` method that sends a request to the endpoint and processes the response, updating the list of saved articles accordingly.

### Error Handling

- Both the backend and frontend provide clear error messages if authentication or bookmark retrieval fails.
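
As an illustration of the import payload and its error handling, the following is a minimal sketch of how the backend might decode and validate the request body before contacting Shiori. Only the JSON keys `url`, `username`, and `password` come from this specification. The `ShioriImportRequest` type, the `ParseShioriImportRequest` function, the validation rules, and the error messages are assumptions made for this example, not the module's actual implementation.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// ShioriImportRequest mirrors the JSON payload of
// POST /api/readlist/import/shiori. The type name is hypothetical;
// only the JSON keys are taken from the specification.
type ShioriImportRequest struct {
	URL      string `json:"url"`
	Username string `json:"username"`
	Password string `json:"password"`
}

// ParseShioriImportRequest (hypothetical helper) decodes the request body
// and rejects payloads that cannot be used to reach a Shiori instance.
func ParseShioriImportRequest(body []byte) (ShioriImportRequest, error) {
	var req ShioriImportRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, fmt.Errorf("invalid JSON payload: %w", err)
	}

	// Trim incidental whitespace; the password is left untouched.
	req.URL = strings.TrimSpace(req.URL)
	req.Username = strings.TrimSpace(req.Username)

	if req.URL == "" || req.Username == "" || req.Password == "" {
		return req, errors.New("url, username, and password are required")
	}

	// Assumed rule: the instance URL must be absolute and use http or https.
	u, err := url.Parse(req.URL)
	if err != nil {
		return req, fmt.Errorf("invalid url: %w", err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return req, errors.New("url must be an absolute http or https URL")
	}

	return req, nil
}
```

A handler could call `ParseShioriImportRequest` on the raw body and return the error text with a 400 status when validation fails, so that the frontend form can display it. Authentication and bookmark retrieval would follow only after the payload passes these checks.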