Scripts Used
*The idea was to build a demo as fast as possible. The quality of the code, or even leaving the code in a usable state, was not of the essence.
The scripts listed below are all available here: https://github.com/patterns-dev/patterns/tree/gh-pages/apps*
###File Scraping
The original site is password protected using cookies.
When you use a PHP routine to access the data, the PHP opens a new tab/window and this forces the site to request name, email address etc.
I am not yet sufficiently capable of dealing with cookies in PHP to code a routine that would deal with this issue.
There may well be some easy stuff I should learn...
Hello. See W3Schools.
I therefore resorted to a "human being simulator" - AutoHotkey.
This app was used to call up the web pages and save them to particular folders:
patterns-url-download-to-file.ahk
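For the record, PHP can carry a cookie from one request to the next without a browser. This is a minimal sketch, assuming the site sets an ordinary `Set-Cookie` header after sign-in; the real site's login flow is not known here, and the helper names `cookiesFromHeaders` and `cookieHeader` are hypothetical. After `file_get_contents()` fetches an http:// URL, PHP fills the `$http_response_header` array with the response header lines, and that array can be fed to the first helper.

```php
<?php
// Collect cookie name => value pairs from raw response header lines,
// for example the $http_response_header array that PHP fills in after
// file_get_contents() on an http:// URL.
function cookiesFromHeaders(array $headerLines): array
{
    $cookies = [];
    foreach ($headerLines as $line) {
        // Matches "Set-Cookie: name=value; attributes..."
        if (preg_match('/^Set-Cookie:\s*([^=;\s]+)=([^;]*)/i', $line, $m)) {
            $cookies[$m[1]] = $m[2];
        }
    }
    return $cookies;
}

// Build the "Cookie:" request header that sends those cookies back.
function cookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return 'Cookie: ' . implode('; ', $pairs);
}

// Possible use on a follow-up request ($url is a placeholder):
// $context = stream_context_create(
//     ['http' => ['header' => cookieHeader($cookies) . "\r\n"]]
// );
// $html = file_get_contents($url, false, $context);
```

The cURL extension offers the same thing through its `CURLOPT_COOKIEJAR` and `CURLOPT_COOKIEFILE` options, if it is installed.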
###Image Scraping
Luckily it is possible to view and grab any of the images directly with a URL.
I used this PHP routine to grab the images and diagrams:
patterns-content-find-images.php
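The general idea of finding images in a saved page can be sketched as below. This is not the script above, only an illustration: the function names, the sample paths and the `example.com` base URL are all hypothetical, and a regular expression is a quick-and-dirty stand-in for a real HTML parser.

```php
<?php
// Return the src attribute of every <img> tag in an HTML string,
// without duplicates.
function findImageSources(string $html): array
{
    preg_match_all(
        '/<img\b[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']/i',
        $html,
        $matches
    );
    return array_values(array_unique($matches[1]));
}

// Turn a relative src into a full URL. Assumes $baseUrl ends in "/".
function absoluteUrl(string $baseUrl, string $src): string
{
    if (preg_match('#^https?://#i', $src)) {
        return $src; // already absolute
    }
    return $baseUrl . ltrim($src, '/');
}

// Downloading is then one call per image, for example:
// copy(absoluteUrl($baseUrl, $src), $targetFolder . '/' . basename($src));
```

`copy()` accepts a URL as its source when `allow_url_fopen` is enabled in the PHP configuration.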
###Making Folders
Each pattern is in its own folder. This allows for easy storage of text and images.
Editing the parameters, I used this routine a number of times to create the 253 folders:
patterns-new-folders.php
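Creating a numbered run of folders takes only a short loop. A minimal sketch, assuming a `pattern-001` style of folder name; that naming scheme and the function name are hypothetical, not taken from the script above.

```php
<?php
// Create one folder per pattern number in the range $first..$last.
// Returns how many folders were newly created.
function makePatternFolders(string $baseDir, int $first, int $last): int
{
    $created = 0;
    for ($n = $first; $n <= $last; $n++) {
        // Zero-padded names keep the folders sorted in directory listings.
        $dir = $baseDir . '/' . sprintf('pattern-%03d', $n);
        // The third argument to mkdir() also creates missing parent folders.
        if (!is_dir($dir) && mkdir($dir, 0777, true)) {
            $created++;
        }
    }
    return $created;
}

// Example: makePatternFolders(__DIR__ . '/patterns', 1, 253);
```

Because existing folders are skipped, the routine can be run again with different parameters without harm.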
###HtmlTidy
The original files are very messy, very 1990s.
I used this routine to run through all the folders and tidy the HTML:
patterns-html-tidy.php
###More Cleanup
Even after the HTML tidy, there was still a lot of detritus in the files.
I greatly improved my regex skills in building this file:
patterns-preg_replace-files.php
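To give a flavor of this kind of cleanup, here is a minimal sketch that applies a list of `preg_replace` rules in sequence. The rules shown (font tags, presentational attributes, non-breaking spaces) are assumed examples of 1990s debris, not the rules in the file above.

```php
<?php
// Strip some typical old-style markup debris from an HTML string.
// Rules are applied in order, each to the result of the previous one.
function cleanHtml(string $html): string
{
    $rules = [
        // Drop <font ...> and </font> but keep the text inside.
        '#</?font\b[^>]*>#i' => '',
        // Drop presentational attributes, quoted or unquoted.
        '#\s+(bgcolor|align|valign)\s*=\s*("[^"]*"|\'[^\']*\'|[^\s>]+)#i' => '',
        // Replace non-breaking space entities with plain spaces.
        '#&nbsp;#i' => ' ',
        // Collapse runs of spaces and tabs.
        '#[ \t]{2,}#' => ' ',
    ];
    return preg_replace(array_keys($rules), array_values($rules), $html);
}
```

Keeping the rules in one array makes it easy to add a new one each time another kind of leftover turns up.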
###Cleaning Up the Menu File
The menu is in a separate HTML file.
I used this routine to replace the ALL UPPERCASE TEXT with something nicer:
patterns-clean-navigation.php
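One way to tame the capitals is sketched below. It only touches text that sits between a `>` and a `<`, so tag names and attributes are left alone. The function name and the sample markup are hypothetical, and the approach is an assumption rather than a copy of the routine above.

```php
<?php
// Convert ALL UPPERCASE text between tags to Title Case.
function niceCase(string $html): string
{
    return preg_replace_callback(
        // Text between ">" and "<" that contains at least two capitals in a row.
        '/>([^<]*[A-Z]{2,}[^<]*)</',
        function (array $m): string {
            return '>' . ucwords(strtolower($m[1])) . '<';
        },
        $html
    );
}
```

Note that `ucwords()` capitalizes every word, so short words such as "of" and "to" come out as "Of" and "To". It also only handles plain ASCII letters.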
###Extract Categories from Menu Files
The menu files contained something like a dozen sub-categories for patterns.
I used this file to find and extract categories:
patterns-extract-cats.php
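Extraction of this sort usually comes down to one `preg_match_all` call. The sketch below assumes the categories are marked up as `<h3>` headings, which is a guess; the actual markup in the menu files may differ, and the function name is hypothetical.

```php
<?php
// Pull the text of every <h3> heading out of a menu file's HTML,
// in order of first appearance and without duplicates.
function extractCategories(string $menuHtml): array
{
    // The "s" flag lets "." match newlines inside a heading.
    preg_match_all('#<h3\b[^>]*>(.*?)</h3>#is', $menuHtml, $matches);
    $categories = [];
    foreach ($matches[1] as $raw) {
        // Remove nested tags such as <b> and surrounding whitespace.
        $text = trim(strip_tags($raw));
        if ($text !== '' && !in_array($text, $categories, true)) {
            $categories[] = $text;
        }
    }
    return $categories;
}
```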
###Creating Summaries
There are a number of web sites that will allow you to supply a link or text and return a generated summary of that text.
[smmry.com](http://smmry.com) seemed the simplest and provides an API to boot.
Even with the API I had issues accessing the results using PHP, so again I resorted to using AutoHotkey:
patterns-smmry-com-get-and-download-to-file.ahk
###Building the Demo Menu
Now we are getting to a moderately interesting bit.
A marvelous thing about GitHub is that you can use it to host and display simple web pages that are built of HTML, CSS and JavaScript files. This facility is automagically available through a special branch in every repository called gh-pages.
The goal is to not only store but also make available for viewing by end-users any and all patterns.
There are 253 patterns in 253 files. It would be totally lame to embed the full menu in each file. So I easily built a web page that embeds the menu in one iframe on the left and the text in an iframe on the right. It was a few minutes' work, it looked great, and no matter where you clicked the long scrolling menu stayed in the same place.
I proudly uploaded the demo to GitHub. Worked just fine - for about five minutes. Then, each time I clicked for a new page, a message popped up saying that GitHub does not support iframe paging. Fail.
Remember as well that GitHub also does not support PHP or includes or anything else normal servers provide. GitHub is a content delivery network and that's it.
GitHub does, however, support web pages that load JavaScript files at runtime. So I decided to put all the menus in a single JavaScript file. In general this works quite well. The only issue is the scrolling of the menu window. When you load a new page the current menu item scrolls to the top - whereas just previously it may have been in the middle. I spent too many hours diddling with DOM scrollIntoView, window.location.hash and window.location.href but could not get anything better than what there is now: anchors at the end of the URLs.
Building a menu with nearly 300 lines is not something you want to do by hand - especially when the menu is frequently changing.
The following PHP file not only builds the menu, it also sets up the styles, titles and various divs to be applied to each of the 253 pattern files when they are loaded:
patterns-make-menu.php
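The core trick, PHP writing out a JavaScript file that fills in the menu when a page loads, can be sketched as follows. The `menu` element id, the `pattern-NNN/index.html` paths and the `p1`, `p2` anchor names are all hypothetical, and the real builder does a good deal more (styles, titles, divs).

```php
<?php
// Build the text of a JavaScript file that injects the menu into a page.
// $titles maps pattern number => pattern title.
function buildMenuScript(array $titles): string
{
    $items = '';
    foreach ($titles as $number => $title) {
        // The anchor at the end of the URL matches the id of the menu entry.
        $href = sprintf('../pattern-%03d/index.html#p%d', $number, $number);
        $items .= sprintf(
            '<li><a id="p%d" href="%s">%d. %s</a></li>',
            $number,
            $href,
            $number,
            htmlspecialchars($title)
        );
    }
    // json_encode() produces a safely quoted JavaScript string literal.
    return 'document.getElementById("menu").innerHTML = '
        . json_encode('<ul>' . $items . '</ul>') . ";\n";
}

// Example: file_put_contents('menu.js', buildMenuScript($titles));
```

Each content page then only needs an empty `<div id="menu"></div>` followed by a `<script src="menu.js"></script>` tag, so the menu itself lives in one place.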
###Building the Demo Content Files
This PHP file grabs data from other folders and builds the final 253 files.
For each file the routine sources and outputs the following data:
Taken from a text file of one title per line -
- Pattern number
- Pattern title
Calculated using internal if elseif's -
- Major Category
- Sub-Category
Reading and parsing external source files and finding all the embedded links, writing out the new links appropriate to the current folder structure -
- Up links
- Horizontal links
- Down links
Creates links to the original files
Creates links to the appropriate section using Google Books
Copies the summaries from the smmry folders
Extracts the keywords from the summaries files
Creates a link to Google Image search using the keywords
Adds the tags that load the menu and adds header and footer tags
The file is here:
patterns-new-pat.php
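Two of the steps listed above are simple enough to sketch. The category boundaries (100 and 200), the category names and the sample keywords are placeholders, not the values in the real file, and the `tbm=isch` query parameter is what Google has used to select image results, which may change.

```php
<?php
// Pick a major category from the pattern number with a plain if/elseif chain.
// The boundaries and names here are placeholders.
function majorCategory(int $number): string
{
    if ($number <= 100) {
        return 'Category A';
    } elseif ($number <= 200) {
        return 'Category B';
    } else {
        return 'Category C';
    }
}

// Build a Google Images search URL from a list of keywords.
function imageSearchUrl(array $keywords): string
{
    // http_build_query() takes care of URL-encoding the values.
    $query = http_build_query([
        'tbm' => 'isch',
        'q'   => implode(' ', $keywords),
    ]);
    return 'https://www.google.com/search?' . $query;
}
```

When the URL is written into an HTML `href` attribute, it should also go through `htmlspecialchars()` so the `&` is escaped.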
###Final Thoughts
A nice thing is that the whole operation of running the menu and content builders only takes a few seconds.
All of this is entry level script kiddie coding. It would be quite reasonable to assume that a junior level coder could take on this kind of work fairly easily and could output at least one new book per day.
Further analysis could probably increase the output by an order of magnitude.