Investigate Twitter data scraping options to recover own data #30

@sqiouyilu

Your Twitter archive includes your content, but it may become harder to access as the platform grows less stable.

Your archive DOES NOT INCLUDE a list of the accounts you’re following or that are following you. The only way to retrieve those lists is with paid services (unless I can find a free or command-line option, but those are unlikely to be user-friendly for the average person trying to set up a nest).

Ethical concerns

It is possible to pay for someone else’s data, assuming their account is public. But although their Tweets may be public, they’re still content that the person authored and hasn’t consented to having archived.

However, there is no way to prevent users from doing this. An attempt to verify that a person is uploading their own Tweets, like forcing an OAuth login and checking the Tweets against the logged-in account, would prevent people from merging their own data across multiple usernames or from a deleted account.

The ethics of scraping someone’s publicly available profile data (display name, handle, ID, bio, URL) are also a bit gray. You could copy and paste everyone’s data manually into your own address book, and people generally wouldn’t have a problem with that. There’s really no way to control what people scrape and import (or even to ensure that it comes from Twitter). But the data should, as much as possible, be restricted to the backend, to prevent people from using nests as directories for targeted harassment.

Used

ExportData.io

  • Clean interface
  • User-friendly
  • Affordable ($3.49 for export of 4271 followers and 4234 followings of @sqiouyilu, but pricing increases with more data)
  • CSV contained all fields I wanted to save EXCEPT user photo

FollowersAnalysis

  • Includes report with breakdown of account statistics, useful more to marketers than to individuals
  • Includes link to profile photo in CSV, but had fewer fields available than ExportData overall
  • Pricier ($39 for one report)
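Since the ExportData CSV lacks the user photo and the FollowersAnalysis CSV includes a photo link but fewer fields, one option might be to combine the two exports. The sketch below is only an illustration: the column names (`handle`, `display_name`, `bio`, `photo_url`) and the sample rows are made up, and the real exports almost certainly use different headers.

```python
import csv
import io


def read_csv(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_exports(primary_rows, photo_rows, key="handle", photo_field="photo_url"):
    """Copy photo_field from photo_rows into primary_rows.

    Rows are matched on `key`, ignoring case and a leading "@".
    The column names are hypothetical placeholders, not the real export headers.
    """
    photos = {
        row[key].lstrip("@").lower(): row.get(photo_field, "")
        for row in photo_rows
        if row.get(key)
    }
    merged = []
    for row in primary_rows:
        out = dict(row)
        handle = (row.get(key) or "").lstrip("@").lower()
        # Leave the photo empty when the second export has no matching account.
        out[photo_field] = photos.get(handle, "")
        merged.append(out)
    return merged


# Made-up sample data standing in for the two exports.
primary = read_csv("handle,display_name,bio\n@alice,Alice,Writer\nbob,Bob,Artist\n")
photos = read_csv("handle,photo_url\nAlice,https://example.com/alice.jpg\n")

merged = merge_exports(primary, photos)
for row in merged:
    print(row)
```

Matching on the numeric account ID instead of the handle would probably be more robust if both exports include it, since handles can change.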

Vicinitas

  • Offers a few free tools, including a limited followers tool, if that’s all you need, but you can’t retrieve a following list
  • Scaled pricing ($5 for up to 20,000 followers, but $20 for 125,000 followers)
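To compare the prices quoted above on a common scale, here is a rough per-account calculation. It uses only the figures listed in this issue; the Vicinitas $5 tier is treated as if the full 20,000-follower cap were used, so the real cost per account would be higher for smaller lists. FollowersAnalysis is left out because the $39 report price isn’t tied to an account count here.

```python
def cost_per_thousand(price_usd, accounts):
    """Return the price in USD per 1,000 exported accounts."""
    return price_usd / accounts * 1000


# Prices and counts as quoted in this issue.
quotes = {
    "ExportData.io (4271 followers + 4234 followings)": (3.49, 4271 + 4234),
    "Vicinitas (up to 20,000 followers)": (5.00, 20_000),
    "Vicinitas (125,000 followers)": (20.00, 125_000),
}

rates = {name: cost_per_thousand(price, count) for name, (price, count) in quotes.items()}

for name, rate in rates.items():
    print(f"{name}: ${rate:.2f} per 1,000 accounts")
```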

Not yet used

Scrapy

  • Command-line tool; still needs investigating.

Labels

researching: Questions and issues that need more research to determine whether implementation is possible
