Merge pull request #68 from justinowusu/proj03
W18 Proj03 -> Project Master
CDLlo authored Mar 21, 2018
2 parents 48a3d70 + 3339f3c commit 90c29f8
Showing 2 changed files with 12 additions and 0 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -84,3 +84,7 @@ In addition, make sure to work closely with the group that has cs56-webapps-cur
F17 Final Remarks:

To the next group of students: we recommend reading as much of the code as you can before doing anything. Read UCSBCurriculumSearch first; it will give you the best idea of how the whole project works. This file will look very ambiguous until you inspect the actual HTML found at the main page URL (https://my.sa.ucsb.edu/public/curriculum/coursesearch.aspx). Once you have read that HTML, you will understand how the scraping process works. By the time you read this, the main page URL may have changed, and many of the scraping functions may no longer work. We suggest looking over the test files and then running ant test first to see what has broken.

W18 Final Remarks:

To the next group of students: this quarter we spent most of our time converting the parsing from substring operations to a parsing library called JSOUP. We recommend looking at `loadCoursesJsoup` in `UCSBCurriculumSearch.java`. This function (and its helpers) uses the JSOUP library and is where the HTML parsing happens. We also recommend looking at the newly refactored tests in `UCSBCurriculumSearchTest.java` to see how the parsing behaves. The GUI is still broken, and we recommend not investing too much time in it, as the future of the project leans more towards the API (the project *cs56-webapps-curriculum* may eventually be combined with this one). A good issue to start on would be refactoring the `getPage` method in `UCSBCurriculumSearch.java` and introducing a library to get the HTML from the web page more reliably. The current `getPage` sometimes fails to return the page. Try using the Geb web automation library to get the contents of the search page instead of hard-coding everything as `getPage`'s current implementation does.
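
As a possible starting point for the `getPage` refactor, here is a minimal sketch of retrying a flaky page fetch. It does not use Geb; it swaps in the JDK 11+ `java.net.http.HttpClient` so that it needs no extra dependency. The names `PageFetcher`, `PageSource`, `fetchWithRetry`, and `httpGet` are hypothetical and do not exist in the project. Treating an empty body as a failure is also an assumption, based only on the remark that `getPage` sometimes doesn't return the page.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper for illustration; not part of the existing project. */
class PageFetcher {

    /** A source of page HTML that may fail, for example a network call. */
    @FunctionalInterface
    interface PageSource {
        String fetch() throws IOException;
    }

    /**
     * Calls source.fetch() up to maxAttempts times, sleeping delayMillis
     * between failed attempts. A null or empty body counts as a failure.
     * Throws UncheckedIOException wrapping the last failure if every attempt fails.
     */
    static String fetchWithRetry(PageSource source, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String html = source.fetch();
                if (html != null && !html.isEmpty()) {
                    return html;
                }
                last = new IOException("empty response on attempt " + attempt);
            } catch (IOException e) {
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new UncheckedIOException(
                            new IOException("interrupted while waiting to retry", e));
                }
            }
        }
        throw new UncheckedIOException(last);
    }

    /** Performs a single GET with explicit connect and request timeouts. */
    static String httpGet(String url) throws IOException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(20))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() != 200) {
                throw new IOException("unexpected HTTP status " + response.statusCode());
            }
            return response.body();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted during request", e);
        }
    }
}
```

A call might look like `PageFetcher.fetchWithRetry(() -> PageFetcher.httpGet(url), 3, 1000)`. This only covers a plain GET of the page; if the search requires submitting form fields, as the current `getPage` may do, that part would still need to be added. Keeping the retry loop separate from the network call also lets the tests pass in a fake `PageSource` instead of hitting the live site.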
8 changes: 8 additions & 0 deletions w18_proj03.md
@@ -0,0 +1,8 @@

Issues for future students:

- Refactor `getPage` [#66](https://github.com/ucsb-cs56-projects/cs56-scrapers-ucsb-curriculum/issues/66) (200 pts)
- Make an API [#54](https://github.com/ucsb-cs56-projects/cs56-scrapers-ucsb-curriculum/issues/54) (300 pts)
- Expand tests in `UCSBCurriculumSearchTest.java` [#67](https://github.com/ucsb-cs56-projects/cs56-scrapers-ucsb-curriculum/issues/67) (150 pts)

Total points: 650 pts
