Building Media Overlay support on top of Readium #377
-
Hi Shane,

Apologies for the delay, I overlooked this notification.

We've been working on a JSON and in-memory model to represent format-agnostic guided navigation compatible with Media Overlays. Although it's currently paused (for lack of time), the goal is to eventually implement Media Overlays with it in the Readium toolkits. You can take a look at the draft and discussions here: readium/architecture#181. I think it would be useful to bring up your hurdles there and see if we can come up with a solution in the model.

So far, we have avoided using CFIs in the mobile toolkits. If you have a compelling case that cannot be addressed by any native Web technologies, such as CSS selectors, it is worth reconsidering. Maybe open a dedicated issue on https://github.com/readium/architecture/discussions?

I'm not super familiar with Media Overlays myself, so it's a bit unclear to me why CSS selectors don't work in your case, or why you need to check explicitly for
What's your use case here? Where is the locator coming from?
Using the Media Overlays mapping, I guess you can figure out a locator that looks like:
You might want to use a
swift-toolkit/Sources/Navigator/TTS/PublicationSpeechSynthesizer.swift, lines 145 to 151 in ba378e8
Side note: if your CSS selectors always contain only a single HTML ID, you can also use
-
Hi folks!
This library is outstanding, and I've been using it to re-write the synced narration EPUB reading system in the Storyteller mobile apps.
I know that synced narration support is planned/being worked on already, but in the meantime, Storyteller needs something, so I've been working on hacking it together on top of what already exists in Readium!
After much tinkering, I have a working system that can sync between the audio position and EPUB reading position on this branch: https://gitlab.com/smoores/storyteller-mobile/-/tree/readium?ref_type=heads. The Swift code that interacts with Readium is in `modules/readium/ios`. The plan is to add back in sentence-level narration highlighting as well, with Decorations.

In order to get there, I had to customize a few different pieces of the EPUBParser (which makes sense, since the current version doesn't have anything Media Overlay-specific in there!). What I'm a little less happy about in the current approach is the hacks I had to add in order to find a location/locator based on a Media Overlay "fragment". Essentially, the Media Overlay SMIL files describe a set of entries mapping "clips" (an audio resource + a start and end time) to and from "fragments" (a text resource + a URL fragment pointing to the ID of a specific element in the text), but I had a lot of trouble working out a good way to actually utilize these fragments with the current swift-toolkit, and I was hoping that someone here might have some ideas/things that I missed.
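To make the clip/fragment mapping concrete, here is a minimal sketch of a two-way lookup. The `Clip`, `Fragment`, and `MediaOverlayIndex` types are hypothetical names made up for illustration; they are not Readium APIs, and the linear time lookup is only an assumption about what would be good enough for one chapter's worth of entries.

```swift
import Foundation

// Hypothetical types for illustration only; none of these come from Readium.
struct Clip: Equatable {
    let audioHref: String      // audio resource
    let start: TimeInterval    // clip start, in seconds
    let end: TimeInterval      // clip end, in seconds
}

struct Fragment: Hashable {
    let textHref: String       // text resource
    let elementID: String      // URL fragment, i.e. the id of the target element
}

struct MediaOverlayIndex {
    private(set) var entries: [(fragment: Fragment, clip: Clip)] = []
    private var clipsByFragment: [Fragment: Clip] = [:]

    init() {}

    // Registers one SMIL entry (one text fragment paired with one audio clip).
    mutating func add(_ fragment: Fragment, _ clip: Clip) {
        entries.append((fragment: fragment, clip: clip))
        clipsByFragment[fragment] = clip
    }

    // Fragment -> clip: a plain dictionary lookup.
    func clip(for fragment: Fragment) -> Clip? {
        clipsByFragment[fragment]
    }

    // Audio position -> fragment: first clip whose [start, end) range contains `time`.
    func fragment(audioHref: String, at time: TimeInterval) -> Fragment? {
        entries.first(where: {
            $0.clip.audioHref == audioHref && $0.clip.start <= time && time < $0.clip.end
        })?.fragment
    }
}
```

The hard part described in this post is not this table but getting from a Readium locator to a `Fragment` and back, which the sketch deliberately leaves out.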
What I would like to be able to do:
At the moment, since Media Overlays only know about clips and fragments, this looks like:

- [...] `span` with an id starting with the string `sentence`, which is a hack that only works for Storyteller-generated books, because I know that those elements are the ones referenced in the Media Overlays
- [...] `cssSelector`. I think I would rather use `partialCfi`s here, but in the moment I didn't want to build a whole CFI system into the `HTMLResourceContentIterator`.
- [...] `cssSelector` as the fragment, search through the corresponding Media Overlay for the correct Clip

This currently looks like:

- [...] `publication.content()`, starting at the link for the resource identified in the Media Overlay fragment, search through the content until I find a Text Segment with an id starting with `sentence` whose sentence count is higher than the one in the fragment. Again, this is relying on Storyteller internals, and not the Media Overlay spec, and in order to make this work I had to also customize the `HTMLResourceContentIterator` to break up Text Segments more frequently so that there would always be at least one Segment per sentence span (the default only flushes Segments when the language changes).

I think that I could clean this up a bit if, as mentioned, I generated CFIs from the Navigator and HTML iterator, rather than relying purely on `cssSelector`s/element IDs. The CFI spec supports adding IDs to the CFI paths, so it would be possible to go from CFI -> fragment, though going from fragment -> CFI would require iterating the HTML as well, which currently isn't the case.

Anyway, I felt it was possible that I was missing something obvious here that would make this easier, so I figured I would ask, in case that was true! If not, this does literally work for now, and I think the CFI approach (though hairy) would allow me to generalize this to all valid Media Overlay EPUBs in the future.
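For what it's worth, the CFI -> fragment direction could be sketched roughly like this. It is a simplified illustration, not a CFI parser: the function name `lastElementID(inCFIPath:)` is made up, it assumes IDs contain none of `/`, `:`, `~`, `@`, and it ignores `^` escapes and text location assertions.

```swift
/// Returns the ID assertion of the deepest step in a CFI path that carries one,
/// e.g. "/4/2[chap01]/10[sentence12]" gives "sentence12".
/// Hypothetical helper; simplified as described above.
func lastElementID(inCFIPath path: String) -> String? {
    var lastID: String? = nil
    for step in path.split(separator: "/") {
        // Drop any terminal offset (":" character, "~" temporal, "@" spatial),
        // so that text location assertions after an offset are not mistaken for IDs.
        let head = step.prefix(while: { $0 != ":" && $0 != "~" && $0 != "@" })
        guard let open = head.firstIndex(of: "["),
              let close = head.lastIndex(of: "]"),
              open < close else { continue }
        let id = head[head.index(after: open)..<close]
        if !id.isEmpty { lastID = String(id) }
    }
    return lastID
}
```

Note that the returned ID may belong to an ancestor of the element the CFI points at (when the final step has no ID assertion), so a caller would still need to decide whether that is close enough to match a Media Overlay fragment.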