tags:Problems and inconsistencies in how tags are created from folder names when importing bookmarks (from at least Firefox) #1897

@znoteer

Description

Some multi-word folder names were split into multiple tags, one per word. Other folder names were left intact as multi-word tags.

Bookmarks -> Toolbar -> suivre -> Énergie hors réseau, chauffage alternatif -> woodgas

produced the tags

bookmarks · toolbar · suivre · Énergie · hors · réseau · chauffage · alternatif · woodgas

but,

Bookmarks -> Toolbar -> suivre -> Énergie hors réseau, chauffage alternatif -> Éolien

was broken apart differently giving

bookmarks · toolbar · suivre · Énergie hors réseau · chauffage alternatif · éolien

Notice also that the capital É of Énergie was preserved, whereas the initial capitals of Bookmarks and Éolien were converted to lower case.

Bookmarks -> Toolbar -> suivre -> Ordi, PCB, SoC -> PCBs (RaspPi, Arduino, Beagle, etc) -> Raspberry Pi computers

gave

bookmarks · toolbar · suivre · ordi · pcb · soc · pcbs · rasppi · arduino · beagle · etc · raspberry · pi · computers

whereas the same path for a different bookmark came out as

bookmarks · toolbar · suivre · ordi · pcb · soc · pcbs rasppi · arduino · beagle · etc · raspberry · pi · computers

where "PCBs (RaspPi" came out as a single tag, "pcbs rasppi".

Likewise,

Bookmarks -> Toolbar -> suivre -> DEL -> Lampes de poche, lanternes

"Lampes de poche" generated either the three tags "lampes", "de", "poche" or the single tag "lampes de poche", depending on the bookmark being imported.

The same thing happened with "Maison intelligente" and "file systems" in the paths

Bookmarks -> Toolbar -> suivre -> Maison intelligente, senseurs, sécurité

and

Bookmarks -> Computers -> file systems, synchronization

In the paths

Bookmarks -> Toolbar -> projetPBX -> VOIP -> IP-PBX, passerelles PSTN, ATA, cartes expansion

and

Bookmarks -> Toolbar -> Search -> Phone, Area Codes

"passerelles PSTN", "cartes expansion" and "Area Codes" were all preserved as 2-word tags.

My preference would be for multi-word folder labels to be preserved as multi-word tags and for capitalisation to be preserved.
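
The preferred behaviour can be sketched in a few lines of Python. This is only an illustration of the request, not the importer's actual code. The function name `folder_path_to_tags` is hypothetical, and it assumes that commas inside a folder name separate tags, as they did in the "Éolien" output above.

```python
def folder_path_to_tags(path, path_sep=" -> ", tag_sep=","):
    """Turn a folder path into tags: one tag per comma-separated label.

    Hypothetical helper. Spaces inside a label and the original
    capitalisation are both preserved.
    """
    tags = []
    for folder in path.split(path_sep):
        for label in folder.split(tag_sep):
            label = label.strip()
            # Skip empty labels and labels already seen earlier in the path
            if label and label not in tags:
                tags.append(label)
    return tags


path = ("Bookmarks -> Toolbar -> suivre -> "
        "Énergie hors réseau, chauffage alternatif -> Éolien")
print(folder_path_to_tags(path))
# ['Bookmarks', 'Toolbar', 'suivre', 'Énergie hors réseau', 'chauffage alternatif', 'Éolien']
```

Because the result depends only on the folder path, two bookmarks in the same folder would always get the same tags.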

Some punctuation is also not handled gracefully. Some folders had datestamps as names. The ':' (colons) in the time part disappeared.

Other folders had a '/' (forward slash, divide-by sign) in the name. In that case the words on either side were concatenated into a single tag. For example, the folder name "one/two" would be converted to "onetwo". It would be preferable to have two tags.
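
A minimal Python sketch of the requested slash handling follows. The helper name `split_on_slash` is hypothetical and not part of any existing importer.

```python
def split_on_slash(label):
    """Split a folder label on '/' and return the trimmed, non-empty parts."""
    return [part.strip() for part in label.split("/") if part.strip()]


print(split_on_slash("one/two"))  # ['one', 'two']
```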

Ampersands ('&') were handled reasonably well, being replaced by the word "amp". It would be more helpful if the '&' were converted to the word "and" and, as above, if the multi-word label were converted to a multi-word tag.

Finally, apostrophes ("'") were converted, for whatever reason, to "39", so that "Can'n" and "l'espace" became "can39n" and "l39espace".
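
One possible explanation, which nothing in this report confirms, is that the folder names are processed while still HTML-escaped. In HTML, '&' is written `&amp;` and an apostrophe can be written `&#39;`, so stripping the punctuation from those entities would leave exactly "amp" and "39". The Python sketch below reproduces that effect and shows that decoding the entities first avoids it. `strip_punctuation` is a hypothetical stand-in for whatever cleanup the importer performs, and "A &amp; B" is a made-up folder name.

```python
import html
import re


def strip_punctuation(text):
    # Hypothetical cleanup step: drop every character that is
    # neither a word character nor whitespace.
    return re.sub(r"[^\w\s]", "", text)


escaped = "l&#39;espace"

# Stripping punctuation from the escaped form leaves the entity's digits behind
print(strip_punctuation(escaped).lower())  # l39espace

# The same happens to the name of the ampersand entity
print(strip_punctuation("A &amp; B"))      # A amp B

# Decoding the entities first recovers the original characters
print(html.unescape(escaped))              # l'espace
print(html.unescape("A &amp; B"))          # A & B
```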

At the end of the process, I also ended up with the tags "0", "1" and "2". I have no idea where they came from. I looked through my entire bookmark folder hierarchy and saw no isolated digits that could have generated them. "0", if clicked, returns the complete list of all my bookmarks and url-less notes, despite having only 9 members according to the tag cloud. "1" and "2" return empty lists when clicked, despite reportedly having 4 and 3 members respectively.

Metadata

Labels

bug (it's broken!)
