Recomposition / decluttering API #67

Open · dy opened this issue Jun 18, 2020 · 0 comments

dy (Member) commented Jun 18, 2020

The current API mixes up various concepts and various contexts, and that does not work well.
Let's try to analyze and clean them up, and figure out the core value of the package, as distinguished from a heap of assorted audio aspects, taking notes and ideas along the way.

There are the following apparent contexts (see the sketch after the list):

  • playback (stream to speaker)
  • recording (stream from mic)
  • manipulations
  • rendering & taking statistics (stream to analyzer)
  • reading & decoding (decode-stream from file)
  • saving & encoding (encode-stream to file)
  • generic streaming in (non-mic: web-audio, url, video/youtube, etc.)
  • generic streaming out (non-speaker: web-audio, renderer, generic observable/asyncIterator, icecast, p2p audio etc.)
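
For reference, a minimal sketch of how several of these contexts show up as separate nodes in a plain web-audio graph today (only standard Web Audio API calls are used; run it as a module, since it relies on top-level await):

```js
const ctx = new AudioContext()

// recording: stream from mic
const mic = await navigator.mediaDevices.getUserMedia({ audio: true })
const source = ctx.createMediaStreamSource(mic)

// statistics: stream to analyser
const analyser = ctx.createAnalyser()
source.connect(analyser)

// playback: stream to speaker
analyser.connect(ctx.destination)
```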

Originally these concerns were each handled by a separate node in the audio-processing graph.
But they can be reclassified into:

  • create (from mic, file, buffer, encoded data, web-audio, url, video etc.)
  • read/output (to speaker, analyzer, buffer, encode, stream, observable, renderer, web-audio etc.)
  • manipulate (stack of string/array-like ops + audio-specific ops)
  • navigate (state of reading: seek, cues, skip, playback, rate, etc.)
  • sync/mix (video track, captions track, rendering? track, other tracks)

↑ Each with different flavors (type of data storage, time-unit conventions, naming, stack of ops vs. direct manipulations).
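
As a rough illustration only, the reclassified verbs might map onto an API shape like this (every name here is hypothetical, not an existing or proposed signature):

```js
// hypothetical shape only - none of these methods exist yet
let audio = await Audio.from(micStream)   // create
audio = audio.slice(0, 4).gain(0.8)       // manipulate: stack of ops
audio.seek(2.5)                           // navigate
audio.play()                              // read/output: to speaker
audio.save('out.wav')                     // read/output: encode to file
```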

Also, it's worth correlating with MDN Audio, which includes its own opinionated subset of operations.

Also, alternative audio modules (wad, aural, howler, ciseaux, etc.) each have their own subset of operations.

Consider possible concepts.

A. Audio (HTMLAudioElement) for Node

! One possible value is simply to provide the standard Audio container for Node.

  • 👍 existing docs (MDN)
  • 👍 compatible with web-audio-api pattern
  • 👎 losing manipulations
  • 👎 that implies implementing a more generic Media class with a bunch of associates: AudioTrackList, AudioTrack, TimeRanges, MediaController, MediaError, MediaKeys, MediaDevices, MediaStream, TextTrack, VideoTrack, MediaKeySession; overall it looks like an organic part of the browser API, not a standalone polyfill.
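
For context, this is the standard browser pattern such a container would have to reproduce (standard HTMLAudioElement API, shown for reference only):

```js
const audio = new Audio('track.mp3')   // create from url
audio.addEventListener('canplaythrough', () => audio.play())
audio.currentTime = 10                 // navigate: seek
audio.playbackRate = 1.5               // navigate: rate
```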

B. Manipulations toolkit 🌟

  • a simple decoded sync data container (AudioBuffer, Float32Array, etc., similar to pxls) that takes in any PCM/float data (likely via audio-buffer-from)
  • for loading audio from a remote source, use audio-load; for recording and other streaming sources, use the corresponding packages
  • basically extends AudioBuffer with a set of chainable methods (note: an inherited AudioBuffer is compatible with a regular one! see the sketch after this list)
    • ~ a possible drawback: an AudioBuffer is fixed in length, so there is no easy way to trim/slice it, etc.
    • ~ also, for long (45s+) sources MDN recommends using MediaElementAudioSourceNode, which is a type of AudioNode
  • for playing audio, use e.g. audio-play
  • 👍 ↑ this way the package can focus on manipulations only, without cramming everything into one, and go a bit deeper, e.g.
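
A sketch of that concept, assuming AudioBuffer is available (globally in browsers, or via a package such as web-audio-api in Node); the Audio class and its gain method are hypothetical:

```js
class Audio extends AudioBuffer {
  // chainable audio-specific op: scale all samples by a gain factor
  // (channel data is writable even though the buffer length is fixed)
  gain (factor) {
    for (let c = 0; c < this.numberOfChannels; c++) {
      const data = this.getChannelData(c)
      for (let i = 0; i < data.length; i++) data[i] *= factor
    }
    return this
  }
}

// an inherited AudioBuffer stays compatible with a regular one,
// so it can be handed straight to e.g. audio-play:
// play(new Audio({ length: 44100, sampleRate: 44100 }).gain(0.5))
```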