nonbinary.org-data

A temporary attempt to make the content of nonbinary.org available. The extraction process is somewhat redundant because I couldn't run the main crawler on my machine, and I couldn't run extended logic on the web crawling service.

Process

| Step | Artifacts | Code |
| --- | --- | --- |
| Extract HTML data from the majority of articles on the wiki (using Apifier). | `2/README` | N/A |
| Remove content to speed up processing. | `unsorted2.json` | Performed using a regex |
| Extract article names, remove unneeded articles, and de-duplicate. | `2/list.json`, `2/times.json` | `tolist.js` |
| Query archive.org for the raw wiki text of articles. | `3/pages/*`, `3/failed.json` | `runquery.js` |
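
The last two steps can be illustrated with a rough sketch. This is not the code in `tolist.js` or `runquery.js`: the function names, the skipped namespace prefixes, and the input shapes are all hypothetical. It also assumes that the wiki runs MediaWiki (so raw wikitext is served via `action=raw`) and that archive.org is queried through the Wayback Machine availability endpoint. The wiki's base path is not known from this README, so it is passed in as a parameter.

```javascript
// Hypothetical helpers; names and data shapes are assumptions, not the repo's code.

// Step 3 (sketch): normalise titles, drop unwanted namespaces, de-duplicate.
function toArticleList(titles, skipPrefixes = ["Special:", "File:", "Talk:"]) {
  const seen = new Set();
  const result = [];
  for (const raw of titles) {
    const title = raw.trim().replace(/_/g, " ");
    if (title === "") continue;
    if (skipPrefixes.some((prefix) => title.startsWith(prefix))) continue;
    if (seen.has(title)) continue; // keep only the first occurrence
    seen.add(title);
    result.push(title);
  }
  return result;
}

// Step 4 (sketch): URL of an article's raw wikitext on a MediaWiki site.
// `base` is the wiki's index.php URL, supplied by the caller.
function rawWikiUrl(base, title) {
  const params = new URLSearchParams({
    title: title.replace(/ /g, "_"),
    action: "raw",
  });
  return `${base}?${params}`;
}

// Wayback Machine availability query for a given target URL.
function availabilityUrl(targetUrl) {
  const params = new URLSearchParams({ url: targetUrl });
  return `https://archive.org/wayback/available?${params}`;
}

// Pick the snapshot URL out of an availability response, or null if none.
function closestSnapshot(response) {
  const snapshots = response && response.archived_snapshots;
  const closest = snapshots && snapshots.closest;
  return closest && closest.available ? closest.url : null;
}
```

A caller might fetch `availabilityUrl(rawWikiUrl(base, title))` for each title from `toArticleList`, save the snapshot body when `closestSnapshot` returns a URL, and record the title as failed when it returns `null`.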
