docs(specs): update feeds, documents, and omnisearch specs with latest changes
parent f75a7a0c09
commit b1ac38d4ab
3 changed files with 169 additions and 0 deletions
specs/documents.md (normal file, 59 lines changed)

@@ -0,0 +1,59 @@
# Documents Specification

## Overview

The Documents domain lets users upload and view PDF documents within the QuickNotes application, so they can manage and read important documents in the browser without having to download them.

## Data Model

A new entity, `Document`, will be introduced with the following attributes:

| Field      | Type      | Description                                   |
|------------|-----------|-----------------------------------------------|
| ID         | string    | Unique identifier for the document (UUID)     |
| Filename   | string    | Original filename of the uploaded document    |
| Path       | string    | Server-side path or storage reference         |
| URL        | string    | Public URL to access the document (if needed) |
| UploadedAt | timestamp | When the document was uploaded                |
| UpdatedAt  | timestamp | When the document was last updated            |

## Storage Architecture

There are two options for storing PDF documents:

1. **Local Disk Storage**: Documents are stored on the server's local filesystem. This method is simple and sufficient for local or small-scale usage.
2. **Blob Storage (e.g., S3)**: For scalability and better durability, a cloud-based blob storage system (such as Amazon S3) could be used.

For now, local disk storage is assumed.


## API Endpoints

| Method | Endpoint           | Description                             |
|--------|--------------------|-----------------------------------------|
| POST   | /api/documents     | Upload a new PDF document               |
| GET    | /api/documents     | List all uploaded documents             |
| GET    | /api/documents/:id | Retrieve details of a specific document |
| DELETE | /api/documents/:id | Delete a specific document              |

### Upload Endpoint

- Accepts `multipart/form-data` with a file field (PDF).
- Validates that the uploaded file is a PDF.
- Stores the file in the designated storage (local disk for now).
- Returns metadata about the uploaded document.

## Frontend Integration

### PDF Viewer

The frontend will integrate a PDF viewer (such as PDF.js) to display PDFs inline without forcing a download. Two pages are involved:

1. **Upload Form**: A page where users can select and upload a PDF document.
2. **Viewer Page**: A page that embeds a PDF viewer component to display the document. This viewer supports zoom, search within the PDF, and navigation between pages.


## Implementation Considerations

1. **Security**: Validate file types and implement proper access controls.
2. **Performance**: If stored locally, ensure that the server has sufficient storage and backup mechanisms.
3. **Scalability**: For future scalability, consider integrating with a cloud blob storage service (e.g., Amazon S3) and a CDN for faster delivery.

@@ -42,6 +42,30 @@ The `Entry` entity represents a single item in a feed:
| CreatedAt | timestamp | When the entry was added to the system |
| UpdatedAt | timestamp | When the entry was last updated in the system |

## UUID Generation

Both the `Feed` and `Entry` entities define GORM `BeforeCreate` hooks that generate a UUID automatically when no ID is provided:

- For the Feed entity:
```go
// BeforeCreate assigns a UUID when the caller has not set an ID.
func (f *Feed) BeforeCreate(tx *gorm.DB) error {
	if f.ID == "" {
		f.ID = uuid.New().String()
	}
	return nil
}
```

- For the Entry entity:
```go
// BeforeCreate assigns a UUID when the caller has not set an ID.
func (e *Entry) BeforeCreate(tx *gorm.DB) error {
	if e.ID == "" {
		e.ID = uuid.New().String()
	}
	return nil
}
```

## Features

### Feed Management

specs/omnisearch.md (normal file, 86 lines changed)

@@ -0,0 +1,86 @@
# Omnisearch Specification

## Overview

The Omnisearch domain provides unified full-text search across all domains of the QuickNotes application. Users can search notes, feed entries, read later items, and uploaded PDF documents with a single query, and each result links to the appropriate detail page.

## Architecture and Components

### Full-Text Search Engine

To support full-text search, a search engine such as [Bleve](https://github.com/blevesearch/bleve) can be integrated into the backend. Bleve is a full-text search and indexing library for Go and is well-suited for this purpose.

### Indexing

The following content types will be indexed for search:

- **Notes**: Index title and content
- **Feed Entries**: Index title, summary, and full content
- **Readlater Items**: Index title, description, and content
- **Documents**: Extract text from PDF documents (using an OCR or PDF text extraction tool, if necessary) and index the extracted content
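
Since each domain contributes different fields, one option is to normalize everything into a single record shape before it reaches the index. The sketch below is illustrative only: `IndexRecord`, the simplified `Note` and `FeedEntry` types, and the helper functions are assumed names, and the real domain models will have more fields.

```go
package main

import (
	"fmt"
	"strings"
)

// IndexRecord is a hypothetical common shape for anything sent to the
// search index, regardless of the domain it came from.
type IndexRecord struct {
	Type  string // e.g. "note" or "feed-entry"
	ID    string
	Title string
	Body  string // remaining searchable text, joined into one field
}

// Note and FeedEntry are simplified stand-ins for the real domain models.
type Note struct{ ID, Title, Content string }
type FeedEntry struct{ ID, Title, Summary, Content string }

// joinNonEmpty concatenates the non-blank parts with newlines.
func joinNonEmpty(parts ...string) string {
	var kept []string
	for _, p := range parts {
		if strings.TrimSpace(p) != "" {
			kept = append(kept, p)
		}
	}
	return strings.Join(kept, "\n")
}

// NoteRecord indexes a note's title and content.
func NoteRecord(n Note) IndexRecord {
	return IndexRecord{Type: "note", ID: n.ID, Title: n.Title, Body: n.Content}
}

// FeedEntryRecord indexes an entry's title, summary, and full content.
func FeedEntryRecord(e FeedEntry) IndexRecord {
	return IndexRecord{
		Type:  "feed-entry",
		ID:    e.ID,
		Title: e.Title,
		Body:  joinNonEmpty(e.Summary, e.Content),
	}
}

func main() {
	rec := FeedEntryRecord(FeedEntry{
		ID:      "entry-id",
		Title:   "Release notes",
		Summary: "Short summary",
		Content: "Full article text",
	})
	fmt.Printf("%+v\n", rec)
}
```

Read later items and documents would get analogous conversion functions, with the document body coming from the extracted PDF text.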

### API Endpoints

| Method | Endpoint    | Description                                   |
|--------|-------------|-----------------------------------------------|
| GET    | /api/search | Perform a full-text search across all domains |

#### Search API

- Accepts a query parameter (e.g., `q`) containing the search text.
- Optionally accepts pagination parameters (e.g., `limit` and `offset`).
- Returns a list of search results, each containing:
  - **Type**: Indicates the domain (note, feed entry, readlist item, document)
  - **ID**: The identifier of the result
  - **Title/Name**: The title or name of the result
  - **Snippet**: A short excerpt showing the context of the match
  - **Link**: A URL to the relevant detail page

Example response:

```json
[
  {
    "type": "note",
    "id": "note-id",
    "title": "Note Title",
    "snippet": "... matching text ...",
    "link": "/notes/note-id"
  },
  {
    "type": "feed-entry",
    "id": "entry-id",
    "title": "Entry Title",
    "snippet": "... matching text ...",
    "link": "/feeds/entries/entry-id"
  }
]
```

## Frontend Integration

### Omnisearch Page

The frontend will feature a dedicated Omnisearch page with the following components:

1. **Search Input**: A single input field where users can enter their search query.
2. **Search Results List**: A list that dynamically displays the search results returned by the API.
3. **Result Filtering**: (Optional) Filters to narrow down results by domain (notes, feed entries, readlist items, documents).

Navigation to a result should route the user to the corresponding detail page:

- **Notes**: `/notes/:id`
- **Feed Entries**: `/feeds/entries/:id` (or `/feeds/:id` for feed-specific views)
- **Readlist Items**: `/readlist/:id`
- **Documents**: A new page (e.g., `/documents/:id`) for viewing the PDF using the integrated PDF viewer.
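
The routing rules above can be captured in one small function that the backend uses to fill in each result's `link`. This is a sketch under assumptions: the type strings `note` and `feed-entry` come from the example response, while `readlist-item` and `document` are hypothetical identifiers, and `linkFor` itself is an illustrative name.

```go
package main

import (
	"fmt"
	"net/url"
)

// linkFor maps a result type and ID to the detail page route.
// "readlist-item" and "document" are assumed type identifiers.
func linkFor(resultType, id string) (string, error) {
	escaped := url.PathEscape(id)
	switch resultType {
	case "note":
		return "/notes/" + escaped, nil
	case "feed-entry":
		return "/feeds/entries/" + escaped, nil
	case "readlist-item":
		return "/readlist/" + escaped, nil
	case "document":
		return "/documents/" + escaped, nil
	default:
		return "", fmt.Errorf("unknown result type %q", resultType)
	}
}

func main() {
	for _, t := range []string{"note", "feed-entry", "readlist-item", "document"} {
		link, err := linkFor(t, "example-id")
		if err != nil {
			panic(err)
		}
		fmt.Println(t, "->", link)
	}
}
```

Returning an error for unknown types makes a missing mapping visible as soon as a new domain is added to the index.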

## Implementation Considerations

1. **Indexing Strategy**: Index updates should occur whenever content is created, updated, or deleted. This may be triggered in the service layers of each domain.
2. **Extraction for Documents**: For PDFs, text extraction might be performed using tools like [pdfcpu](https://github.com/pdfcpu/pdfcpu) or similar to convert PDF content into text for indexing.
3. **Performance**: Ensure that indexing and search queries are performant. Consider indexing asynchronously so that writes are not blocked, and refreshing the index regularly.
4. **Scalability**: Although Bleve works well for local deployments, evaluate whether a more robust search solution is needed as the application scales.
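
As one possible reading of the indexing and performance points, service layers could push create, update, and delete events onto a queue that a single background worker applies to the index. Everything named below is hypothetical: the `Indexer` interface, `AsyncIndexer`, and the in-memory `memoryIndex`, which exists only to make the example runnable and is not a substitute for Bleve.

```go
package main

import (
	"fmt"
	"sync"
)

// Indexer is the minimal surface a search backend would need to expose.
type Indexer interface {
	Index(id, text string) error
	Delete(id string) error
}

// memoryIndex is a toy Indexer used only to make this example runnable.
type memoryIndex struct {
	mu   sync.Mutex
	docs map[string]string
}

func (m *memoryIndex) Index(id, text string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.docs[id] = text
	return nil
}

func (m *memoryIndex) Delete(id string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.docs, id)
	return nil
}

// op is one queued index update.
type op struct {
	id     string
	text   string
	remove bool
}

// AsyncIndexer queues updates so request handlers do not wait on indexing.
type AsyncIndexer struct {
	ops  chan op
	done chan struct{}
}

// NewAsyncIndexer starts one worker that applies updates in arrival order.
func NewAsyncIndexer(backend Indexer, buffer int) *AsyncIndexer {
	a := &AsyncIndexer{ops: make(chan op, buffer), done: make(chan struct{})}
	go func() {
		defer close(a.done)
		for o := range a.ops {
			var err error
			if o.remove {
				err = backend.Delete(o.id)
			} else {
				err = backend.Index(o.id, o.text)
			}
			if err != nil {
				fmt.Println("index update failed:", err)
			}
		}
	}()
	return a
}

// Upsert queues a create or update; it blocks only when the buffer is full.
func (a *AsyncIndexer) Upsert(id, text string) { a.ops <- op{id: id, text: text} }

// Remove queues a deletion.
func (a *AsyncIndexer) Remove(id string) { a.ops <- op{id: id, remove: true} }

// Close stops accepting updates and waits for queued ones to be applied.
func (a *AsyncIndexer) Close() {
	close(a.ops)
	<-a.done
}

func main() {
	idx := &memoryIndex{docs: map[string]string{}}
	queue := NewAsyncIndexer(idx, 16)
	queue.Upsert("note-1", "first draft")
	queue.Upsert("note-1", "second draft")
	queue.Remove("note-2")
	queue.Close()
	fmt.Println(idx.docs)
}
```

Using a single worker keeps updates for the same ID in order, at the cost of indexing throughput; a queue that survives restarts would be needed if index updates must not be lost.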

## Summary

The Omnisearch feature provides a unified search experience, leveraging full-text indexing with Bleve to search across all content domains. It improves usability by allowing users to quickly find information regardless of its domain, with results linking to the appropriate detail pages.