Readlist Specification

Overview

The Readlist module allows users to save web articles for later reading. It provides a "read it later" service similar to Pocket or Instapaper, with automatic content extraction and a clean reading experience.

Data Model

ReadLaterItem

The ReadLaterItem entity has the following attributes:

| Field | Type | Description |
| --- | --- | --- |
| ID | string | Unique identifier for the item (UUID) |
| URL | string | Original URL of the article |
| Title | string | Title of the article |
| Content | string | Extracted HTML content of the article |
| Description | string | Brief description or excerpt of the article |
| CreatedAt | timestamp | When the item was saved |
| UpdatedAt | timestamp | When the item was last updated |
| ReadAt | timestamp | When the item was marked as read (null if unread) |
| ArchivedAt | timestamp | When the item was archived (null if not archived) |

Features

Item Management

  1. Save Article: Users can save articles by providing a URL
  2. View Article: Users can view saved articles in a clean, reader-friendly format
  3. Mark as Read: Users can mark articles as read
  4. Archive Article: Users can archive articles to remove them from the main list
  5. Delete Article: Users can delete articles permanently
  6. List Articles: Users can view a list of all saved articles

Content Extraction

  1. Automatic Extraction: The system automatically extracts the main content from web pages
  2. Title Extraction: The system extracts the title of the article
  3. Description Extraction: The system extracts a brief description or excerpt of the article
  4. HTML Cleaning: The system cleans the HTML to provide a distraction-free reading experience

Filtering and Sorting

  1. Filter by Status: Users can filter articles by read/unread status
  2. Filter by Archive: Users can filter articles by archived/unarchived status
  3. Sort by Date: Users can sort articles by date saved or date read

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/readlist | List all read later items |
| POST | /api/readlist | Save a new article |
| GET | /api/readlist/:id | Get a specific article by ID |
| PUT | /api/readlist/:id | Update a specific article |
| DELETE | /api/readlist/:id | Delete a specific article |
| PUT | /api/readlist/:id/read | Mark an article as read |
| PUT | /api/readlist/:id/unread | Mark an article as unread |
| PUT | /api/readlist/:id/archive | Archive an article |
| PUT | /api/readlist/:id/unarchive | Unarchive an article |
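
To show how the four status endpoints share one shape, here is a small client-side sketch that builds the corresponding PUT request. The function name `newStatusRequest` and the `baseURL` parameter are hypothetical; only the paths and the method come from the endpoint list.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// statusActions lists the status sub-resources from the endpoint list.
var statusActions = map[string]bool{
	"read": true, "unread": true, "archive": true, "unarchive": true,
}

// newStatusRequest builds a PUT request for one of the status endpoints,
// for example PUT {baseURL}/api/readlist/{id}/archive.
func newStatusRequest(baseURL, id, action string) (*http.Request, error) {
	if !statusActions[action] {
		return nil, fmt.Errorf("unknown status action %q", action)
	}
	// PathEscape keeps unusual IDs from altering the path structure.
	target := fmt.Sprintf("%s/api/readlist/%s/%s", baseURL, url.PathEscape(id), action)
	return http.NewRequest(http.MethodPut, target, nil)
}
```

A caller would pass the returned request to an `http.Client`; none of these endpoints is described as taking a request body, so the sketch sends none.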

Frontend Routes

| Route | Description |
| --- | --- |
| /readlist | List of saved articles |
| /readlist/:id | View a specific article |

Implementation Details

Content Extraction

The system uses the go-readability library to extract content from web pages:

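// ParseURL fetches r.URL and fills Title, Content, and Description
// from the extracted article. The fetch times out after 30 seconds.
func (r *ReadLaterItem) ParseURL() error {
    article, err := readability.FromURL(r.URL, 30*time.Second)
    if err != nil {
        return fmt.Errorf("failed to parse URL: %w", err)
    }

    // Copy the extracted fields onto the item; Excerpt becomes Description.
    r.Title = article.Title
    r.Content = article.Content
    r.Description = article.Excerpt
    return nil
}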

HTML Sanitization

The extracted HTML content is sanitized to remove potentially harmful elements and provide a consistent reading experience:

  1. Remove JavaScript and other active content
  2. Preserve images, links, and basic formatting
  3. Apply a consistent style to the content

Status Management

The system tracks the status of articles using nullable timestamp fields:

  1. ReadAt: When set, indicates the article has been read
  2. ArchivedAt: When set, indicates the article has been archived

User Interface

Article List

  • Displays a list of saved articles with titles, descriptions, and dates
  • Provides filters for read/unread and archived/unarchived status
  • Includes buttons for marking as read, archiving, and deleting

Article Viewer

  • Displays the article content in a clean, reader-friendly format
  • Preserves images and links from the original article
  • Provides buttons for marking as read, archiving, and returning to the list

Save Form

  • Input field for the URL to save
  • Automatic extraction of content after submission
  • Preview of the extracted content before saving

Shiori Import

The Readlist module supports importing bookmarks from a Shiori instance. Users connect to the Shiori service with their credentials to migrate or synchronize their bookmarks.

API Endpoint

  • POST /api/readlist/import/shiori: Accepts a JSON payload containing url, username, and password. The backend authenticates with the Shiori instance, fetches bookmarks, and creates corresponding read later items.

Frontend Integration

  • A form in the readlist UI accepts the Shiori URL and credentials. The readlist store includes an importFromShiori method that sends a POST request to /api/readlist/import/shiori and processes the response, updating the list of saved articles accordingly.

Error Handling

  • Both the backend and frontend provide clear error messages if authentication or bookmark retrieval fails.