
Design Document

Bret Comnes edited this page Aug 23, 2022 · 6 revisions

Breadcrum.net: what is it?

stateDiagram-v2
    [*] --> Breadcrum.net
    Breadcrum.net --> Accounts
    iOS_macOS_App --> Breadcrum.net
    Webextension --> Breadcrum.net
    Web_client --> Breadcrum.net
    Breadcrum.net --> 📧TX_Email
    Accounts --> Account_verification
    📧TX_Email --> Account_verification
    📧TX_Email --> Invites
    Accounts --> Invites
    Accounts --> 🔖Bookmarks
    🔖Bookmarks --> 🏷Tags
    😬Sensitive --> 🔖Bookmarks
    ⭐️Starred --> 🔖Bookmarks
    🔵Unread --> 🔖Bookmarks
    😬Sensitive --> 🏷Tags
    🔖Bookmarks --> 📡Feeds
    📡Feeds --> 📼Episodes
    🔖Bookmarks --> 📼Episodes
    😬Sensitive --> 📼Episodes
    📼Episodes --> ⤴️Redirect_yt_dlp
    📼Episodes --> 🥩raw_url
    📼Episodes --> 🎞yt_dlp_ffmpeg
    🔖Bookmarks --> 🗄Archive
    🗄Archive --> 📁File
    🗄Archive --> 📃Reader_mode_archive
    🗄Archive --> 🎞yt_dlp_ffmpeg
    🎞yt_dlp_ffmpeg --> 📁File
    📃Reader_mode_archive --> 📁File
    📃Reader_mode_archive --> 🗞Send_to_kindle
    📃Reader_mode_archive --> 🖨Print_queue
    🐦Twitter_archive --> 📁File
    🐦Twitter_archive --> 🧵Thread_unroll
    🧵Thread_unroll --> 📁File
    🗄Archive --> 🐦Twitter_archive

MVP

  • Bookmarking storage and organization. This is the primary 'atom' of the service. Everything hangs off of saving a 'bookmark' of a URL. See http://pinboard.in and https://del.icio.us
  • Bookmarks have a title, description, tags and a read/unread state.
  • Lightweight content archival. Breadcrum saves the 'reader mode' view of a webpage, as well as any images it can find. Full-text archives are a more automatic and complete alternative to messy hand-organized titles, descriptions and tags. Content archival requires a paid account. Image storage may be subject to additional storage fees. These storage fees should be passed through at cost from B2.
  • Content archival should provide different extraction methods.
  • Full text search. Since bookmarks are stored with a consistent archive of the contents, full text search is a lot more useful than the notes or metadata usually associated with bookmarking tools.
  • Podcast anything. Inspired by huffduff-video, when you save a bookmark to Breadcrum, you can optionally request that the page be run through youtube-dl (or yt-dlp) and have the results inserted into your private podcast feed. It lets you queue media and consume a private collage of media from around the web. Podcast media will live for 2 weeks before being garbage collected. Media can be marked as archived indefinitely, but it will then count against your storage budget.
    • Support audio or video.
    • Support re-publish action for timed-out content.
  • Private only. The social bookmarking concept doesn't work. Bookmark streams are so hyperpersonalized that they are a bad way to 'share' with semi-related groups.
  • Sensitive data mode. You should be able to designate some bookmarks as sensitive. These are not shown by default and require authentication to view. This lets you store sensitive info in the service, but also feel comfortable using it in front of a group by hiding it from plain sight in most situations.
  • Must be fast. Bookmarks should be a single click action and save quickly. All slow extractions must be done async.
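
The bookmark fields, the sensitive-data mode, and the two-week podcast media lifetime described above can be sketched as a small data model. This is a minimal illustration, not Breadcrum's actual schema: the class names, field names, and the helpers `visible_bookmarks` and `is_collectable` are all hypothetical, and the only values taken from the text are the listed bookmark fields, the hide-by-default rule for sensitive bookmarks, and the 2-week window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# From the text: podcast media lives for 2 weeks unless archived.
EPISODE_TTL = timedelta(weeks=2)


@dataclass
class Bookmark:
    """Hypothetical bookmark shape; field names are illustrative only."""
    url: str
    title: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    unread: bool = True
    sensitive: bool = False


@dataclass
class Episode:
    """Hypothetical podcast episode created from a bookmark."""
    bookmark_url: str
    created_at: datetime
    archived: bool = False  # archived media is kept indefinitely


def visible_bookmarks(bookmarks, sensitive_unlocked=False):
    """Hide sensitive bookmarks unless the user has authenticated to see them."""
    return [b for b in bookmarks if sensitive_unlocked or not b.sensitive]


def is_collectable(episode, now):
    """True when an unarchived episode has outlived the two-week window."""
    return not episode.archived and now - episode.created_at >= EPISODE_TTL
```

A garbage-collection job could call `is_collectable` on each episode with the current UTC time, and the default listing would call `visible_bookmarks` without the unlock flag so sensitive items stay out of plain sight.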

Long shot goals

  • Shortcuts to stash and retrieve copies in http://archive.today will also be provided via http://mementoweb.org/guide/quick-intro/
  • Twitter gets a special content extractor
    • Unroll threads
    • Save media
    • Save metadata
  • Read it later.
    • An on-site reading queue that actually clears the unread mark from items as you read them.
    • Print queue. Unread articles can be queued up into a single print job, and marked as 'read' when you print.
    • Send to Kindle. Mark as 'read' when sent.
    • Keep a 'read' log and the method used to 'read' it.
  • Web extension.
    • Client side content extraction for when servers can't see the content.
  • iOS App/Share sheet. Quick, offline access to Breadcrum data.
  • macOS App. Same function as iOS but for desktop.
  • e2e encrypted mode. All content in e2e mode is encrypted client side, except for maybe some optional metadata. Viewing and searching it must also be done client side.
    • Client creates a private key
    • Private key is encrypted with some hash of the password
    • The encrypted private key is stored custodially on Breadcrum.
    • Changing passwords re-encrypts the private key; bookmarks retain their original encryption.
  • Audio transcripts of extracted videos.
  • Long term archival storage billed at use. Expose a frontend to the B2 storage you want to use.
  • PDF storage with content extraction.
  • Collections mode / folders
  • Recording mode. Browser extension. Click record. Save bookmarks as you navigate.
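
The e2e key handling listed above (client-generated private key, wrapped with a password-derived key, re-wrapped on password change while bookmark ciphertext stays untouched) might look roughly like the sketch below. Everything here is an assumption made for illustration: the function names, the record layout, the PBKDF2 round count, and especially the XOR wrapping step, which is only a standard-library stand-in for a real authenticated cipher such as AES-GCM and should not be used as-is.

```python
import hashlib
import secrets

PBKDF2_ROUNDS = 200_000  # assumption: illustrative work factor, not a recommendation


def _wrapping_key(password: str, salt: bytes, length: int) -> bytes:
    """Derive a wrapping key from the password ("some hash of the password")."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ROUNDS, dklen=length
    )


def wrap_private_key(private_key: bytes, password: str) -> dict:
    """Encrypt the private key client side; the result is what the server stores."""
    salt = secrets.token_bytes(16)  # fresh salt for every wrap
    pad = _wrapping_key(password, salt, len(private_key))
    # Stand-in cipher: XOR with the derived key. A real client would use an
    # authenticated cipher here so a wrong password is detected, not silent.
    wrapped = bytes(a ^ b for a, b in zip(private_key, pad))
    return {"salt": salt, "wrapped": wrapped}


def unwrap_private_key(record: dict, password: str) -> bytes:
    """Recover the private key on the client from the stored record."""
    pad = _wrapping_key(password, record["salt"], len(record["wrapped"]))
    return bytes(a ^ b for a, b in zip(record["wrapped"], pad))


def change_password(record: dict, old_password: str, new_password: str) -> dict:
    """Re-wrap the same private key; bookmark ciphertext is never touched."""
    private_key = unwrap_private_key(record, old_password)
    return wrap_private_key(private_key, new_password)
```

Because only the wrapped key changes on a password change, bookmarks encrypted under the private key never need to be re-encrypted, which matches the last point in the list.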