THESE MUST BE INSTALLED FOR YOUR PRACTICAL
Name | Download link | Notes |
---|---|---|
Gephi | https://gephi.org/users/download/ | You must install this for your practical |
OpenRefine | http://openrefine.org/download.html | You should already have this installed |
Sublime Text | https://www.sublimetext.com/ | You should already have this installed |
CSV Editor | https://github.com/ScriptSmith/csveditor/ | Instructions for use are on that page. On a Mac, you need to right-click the program to open it the first time |
Reaper >= v0.1.7 | https://github.com/ScriptSmith/reaper/releases | Must be a version greater than or equal to v0.1.7 |
These are the only functions you should attempt to graph in Gephi
- Facebook
  - Post's comments
  - Page's posts' comments
  - Group's posts' comments
- Twitter
  - Search's tweets
  - Hashtag's tweets
- Reddit
  - Thread's comments
  - Search's threads' comments
  - Subreddit's threads' comments
- YouTube (Not currently available)
  - Video's comments
  - Search's videos' comments
  - Channel's videos' comments
The following are instructions for scraping from a source in Reaper, editing the files it extracts and viewing them in Gephi
Facebook: Post's comments
- Tick the box to include information from the original post
- In the original post's fields, make sure `From` is ticked
- In the comment's fields, make sure `Parent` is ticked
- Save it as a CSV
- Open it in OpenRefine
- Rename `from.name` to `Source`
- Add a column named `Target` based on `original_post.from.name`
  - Expression: `if(isNonBlank(row.cells["parent.from.name"]), row.cells["parent.from.name"].value, value)`
- Export from OpenRefine
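The `Target` expression in the steps above can be hard to read at first. As a rough illustration only (the practical itself uses OpenRefine), the same rule could be sketched in Python like this: use the parent comment's author when there is one, otherwise fall back to the author of the original post. The function name `add_source_target` and the sample rows are made up for this example.

```python
import csv
import io


def add_source_target(rows):
    """Yield copies of rows with `from.name` renamed to Source and a Target added."""
    for row in rows:
        out = dict(row)
        # Rename from.name to Source.
        out["Source"] = out.pop("from.name", "")
        # Target is based on original_post.from.name, but the parent
        # comment's author is used whenever that cell is non-blank.
        parent = (row.get("parent.from.name") or "").strip()
        out["Target"] = parent if parent else (row.get("original_post.from.name") or "")
        yield out


# Made-up sample rows: a reply to Bob's comment, then a top-level comment.
sample = io.StringIO(
    "from.name,parent.from.name,original_post.from.name\n"
    "Alice,Bob,Carol\n"
    "Dave,,Carol\n"
)
edges = list(add_source_target(csv.DictReader(sample)))
for edge in edges:
    print(edge["Source"], "->", edge["Target"])
```

A reply points at the commenter it replies to, while a top-level comment points at the author of the original post.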
Facebook: Page's posts' comments
- Tick the box to include information from the original post
- In the original post's fields, make sure `From` is ticked
- In the comment's fields, make sure `Parent` is ticked
- Select `posts` as the post type if you want only posts from the page; select `feed` if you want posts from others as well
- Save it as a CSV
- Open it in OpenRefine
- Rename `from.name` to `Source`
- Add a column named `Target` based on `original_post.from.name`
  - Expression: `if(isNonBlank(row.cells["parent.from.name"]), row.cells["parent.from.name"].value, value)`
- Export from OpenRefine
Facebook: Group's posts' comments
- Tick the box to include information from the original post
- In the original post's fields, make sure `From` is ticked
- In the comment's fields, make sure `Parent` is ticked
- Save it as a CSV
- Open it in OpenRefine
- Rename `from.name` to `Source`
- Add a column named `Target` based on `original_post.from.name`
  - Expression: `if(isNonBlank(row.cells["parent.from.name"]), row.cells["parent.from.name"].value, value)`
- Export from OpenRefine
Twitter: Search's tweets
- Make sure that your search topic is recent and trending. You may want to select `recent` as your Result type
- Save it as a CSV
- Open it in CSV Editor
- Select the columns to keep; these must include `user.screen_name` and `retweeted_status.user.screen_name`, which are renamed in a later step
- Save it from CSV Editor as a new file
  - Wait until `New Row Count` is the same as `Old Row Count` to confirm that it has finished saving
- Open the new file in Sublime Text
- Rename `user.screen_name` to `Source`
- Rename `retweeted_status.user.screen_name` to `Target`
- Save the file
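For anyone curious what the CSV Editor and Sublime Text steps above amount to, here is a minimal Python sketch, not a replacement for the tools used in the practical. It keeps only the two screen-name columns and writes them out under the headers `Source` and `Target`. The function name `select_and_rename` and the sample data are made up for this example.

```python
import csv
import io

# Original column name -> name Gephi expects in an edges table.
RENAMES = {
    "user.screen_name": "Source",
    "retweeted_status.user.screen_name": "Target",
}


def select_and_rename(in_file, out_file, renames=RENAMES):
    """Copy only the columns in `renames`, writing them under their new names.

    Returns the number of data rows written, which should match the
    number of data rows in the input.
    """
    reader = csv.DictReader(in_file)
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(list(renames.values()))
    count = 0
    for row in reader:
        writer.writerow([row.get(col) or "" for col in renames])
        count += 1
    return count


# Made-up sample: one retweet and one original tweet (no retweeted user).
sample = io.StringIO(
    "id,user.screen_name,text,retweeted_status.user.screen_name\n"
    "1,alice,hello,bob\n"
    "2,carol,hi,\n"
)
result = io.StringIO()
rows_written = select_and_rename(sample, result)
print(result.getvalue())
```

Tweets that are not retweets end up with an empty `Target`, so expect some rows without a complete edge.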
Twitter: Hashtag's tweets
- Make sure that your hashtag is recent and trending. You may want to select `recent` as your Result type
- Save it as a CSV
- Open it in CSV Editor
- Select the columns to keep; these must include `user.screen_name` and `retweeted_status.user.screen_name`, which are renamed in a later step
- Save it from CSV Editor as a new file
  - Wait until `New Row Count` is the same as `Old Row Count` to confirm that it has finished saving
- Open the new file in Sublime Text
- Rename `user.screen_name` to `Source`
- Rename `retweeted_status.user.screen_name` to `Target`
- Save the file
Reddit: Thread's comments
- Check the box that says `Include parent`
  - Note that when it is checked, Reaper can download at most 500 comments per thread
- Save it as a CSV
- Open it in Sublime Text
- Rename `data.author` to `Source`
- Rename `parent.data.author` to `Target`
- Save the file
Reddit: Search's threads' comments
- Check the box that says `Include parent`
  - Note that when it is checked, Reaper can download at most 500 comments per thread
- Save it as a CSV
- Open it in Sublime Text
- Rename `data.author` to `Source`
- Rename `parent.data.author` to `Target`
- Save the file
Reddit: Subreddit's threads' comments
- Check the box that says `Include parent`
  - Note that when it is checked, Reaper can download at most 500 comments per thread
- Save it as a CSV
- Open it in Sublime Text
- Rename `data.author` to `Source`
- Rename `parent.data.author` to `Target`
- Save the file
YouTube: Not currently available
- When first setting up Gephi, make sure your plugins are up to date
  - Go to `Tools` -> `Plugins`
  - Click the `Check for Updates` button and follow the process to install the updates for your plugins
- Import data by going to `File` -> `Import spreadsheet`
- Select the CSV file you want to import and import it as an `Edges Table`
- If the warning `Found row(s) with empty Source and/or Target columns` appears and it won't let you click `Next>`, make sure you've updated your plugins
- Click `Next>` and then `Finish`
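If you want to know in advance whether the empty Source/Target warning is likely to appear, a short Python check such as the sketch below can count the affected rows before you import. It assumes the file already has `Source` and `Target` headers; the function name `count_incomplete_edges` and the sample data are made up for this example. It only reports the numbers and does not change the file.

```python
import csv
import io


def count_incomplete_edges(in_file):
    """Return (total_rows, rows_missing_source_or_target) for an edges CSV."""
    reader = csv.DictReader(in_file)
    total = 0
    incomplete = 0
    for row in reader:
        total += 1
        source = (row.get("Source") or "").strip()
        target = (row.get("Target") or "").strip()
        if not source or not target:
            incomplete += 1
    return total, incomplete


# Made-up sample: one complete edge, one without a Target, one without a Source.
sample = io.StringIO(
    "Source,Target\n"
    "alice,bob\n"
    "carol,\n"
    ",dave\n"
)
total_rows, incomplete_rows = count_incomplete_edges(sample)
print(f"{incomplete_rows} of {total_rows} rows have an empty Source or Target")
```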
There are 3 viewing modes: `Overview`, `Data Laboratory` and `Preview`
- In `Data Laboratory`, select `Copy data to another column` and select `Id`
- Then select `Label`
- Now in the `Overview`, when you click the button to add labels (the black `T` at the bottom), you can see the node's name
- Choose the `Force Atlas 2` layout
- Press the `Run` button to run the layout
- Press the `Stop` button to stop it when the graph stops moving significantly
- Use the `Expansion` and `Contraction` layouts to spread nodes apart or pull them together. Set the scale factor above 1 to expand, or between 0 and 1 to contract
- Click the magnifying glass on the left toolbar to see the entire graph
Gephi doesn't allow for parallel edges, so it merges those edges into a single edge.
If you want to visualize the frequency of edges between nodes (how often they connect), you need to include a weight:
- In OpenRefine / CSV Editor, remove all the columns other than `Source` and `Target`
  - We do this because the other columns contain unique identifiers for particular edges, which prevent the edges from being merged
- Add a new column based on the `Source` column
  - Call it `Weight` and set the value of the expression to just `1`
- Export the CSV
Now when you view the network, parallel edges will be merged so that their weight is increased according to the number of parallel edges
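To preview what the merged weights should look like, the sketch below counts parallel edges in Python. This is a different route from the `Weight` = `1` column described above: it adds up the duplicates itself rather than leaving the merge to Gephi, so treat it as a way to check the expected numbers. It assumes the graph is directed, so `a -> b` and `b -> a` are counted separately. The function name `weighted_edges` and the sample data are made up for this example.

```python
import csv
import io
from collections import Counter


def weighted_edges(in_file):
    """Return (source, target, weight) tuples, one per distinct directed edge."""
    reader = csv.DictReader(in_file)
    # Each repeated (Source, Target) pair adds 1 to that edge's weight.
    counts = Counter((row["Source"], row["Target"]) for row in reader)
    return [(source, target, weight) for (source, target), weight in counts.items()]


# Made-up sample: a -> b appears twice, b -> a once.
sample = io.StringIO(
    "Source,Target\n"
    "a,b\n"
    "a,b\n"
    "b,a\n"
)
merged = weighted_edges(sample)
for source, target, weight in merged:
    print(source, target, weight)
```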
See the quick-start guide for the analysis you can do in Gephi