-
Notifications
You must be signed in to change notification settings - Fork 0
Algorithms
###Overview ######Goal Our eventual goal is to read crime and map data, and use that to come up with safe and short walking routes.
######Tasks
- Shortest Path
- Crime Prediction
- Combining previous
###Shortest Path Calculation
######Converting to a graph If we have a map of a city, the ways we can travel across that map are limited. We cannot usually travel through buildings or fenced in areas, which means that usually we are limited to moving along streets. We can think of streets as connecting intersections. Then the problem of finding a path from A to B is now about finding which streets to travel across.
This simplification allows us to treat our map as a graph, where vertices represent intersections and edges represent streets. While walking is more complicated in this respect than driving, more on that here, walking paths generally also allow for this simplification.
######Shortest Path on a Graph Dijkstra's algorithm finds the shortest path between two points in O(E log V) time where V is the number of vertices, and E is the number of edges. For graphs with a large number of edges and vertices (for example, all of the streets in baltimore, this algorithm can take too long to calculate.
There are other algorithms which approximate (or exactly compute) shortest paths, but do so in a shorter time. In general, the idea is that they use landmarks and add redundant edges to reduce the computational time required to find a shortest path. For more details, see references
######References http://research.microsoft.com/en-us/news/features/shortestpath-070709.aspx http://research.microsoft.com/apps/pubs/default.aspx?id=60764 http://www.cs.princeton.edu/courses/archive/spr09/cos423/Lectures/reach-mit.pdf
###Crime Estimation
######Problem Statement In general, we want to know the crime probability of a particular location at a particular time in order to determine how safe that area is to walk through. However, crime data normally comes in discrete incidents. Instead of information about the probability of crime at a location, we instead have records of crimes that actually occurred. We want to use these crime records to help us estimate the probability of crimes occurring in the future.
#####Possible Solutions:
######Level Set Estimation We are not interested in determining exactly the amount of crime in a region. Rather, we care more about finding regions with high crime, and regions without high crime (at least to start). With this in mind, we can switch focus from estimating the probability of crime, to estimating the region where the probability of crime is greater than some percent. Of course, if we do this for several percents, we have an idea of the distribution of crime in the area.
######Time Series Analysis The distribution of crime varies over time. The probability of crime occurring in the same place may change based on a large number of factors. Time Series techniques are designed to predict the evolution of time dependent behaviour as they evolve. In our case, this would represent the difference between predicting whether this path has been relatively safe over the past 20 years, or it is relatively safe now.
######Topological Data Analysis We can view the crime data as a point cloud, and try to figure out things about it geometrically/topologically. We can think of the crimes as circles representing hotspots, and push out from them. We can look at how the area affected changes as we push out more and more. We can use this to figure out things like how large the features of our crime set are and other things. Moreover, we can apply this to the time component as well, and see how features in time dependence persist with respect to time.
######References https://plot.ly/python/alpha-shapes/#generating-alpha-shape-of-a-set--of-2d-points https://datafloq.com/read/los-angeles-police-department-predicts-fights-crim/279 http://bgr.com/2016/05/23/court-risk-assessment-algorithm-northpointe/ http://graymattersystems.com/big-data-crime-patterns/ https://datafloq.com/read/los-angeles-police-department-predicts-fights-crim/279 http://www.iaca.net/dc_about_ca.asp http://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/ http://www.ams.org/journals/bull/2008-45-01/S0273-0979-07-01191-3/ http://www.ncbi.nlm.nih.gov/pubmed/9222792 http://nowak.ece.wisc.edu/SLT09.html
###Putting them together Suppose we have the following things:
- An algorithm for finding shortest paths given a graph
- An estimation of the crime in an area (either several level sets, or an actual estimate)
Then we can combine them in the following way Take our graph, and adjust the distances by the crime level. Increase them for paths which pass through high crime areas. Decrease them for paths which pass through low crime areas.