Converting Lat/Lon to Zip Code
I noticed a question on the Analytics X Prize forum about how to determine the zip code for homicides with latitude and longitude values. While there are a plethora of online tools (Google Maps, etc) that will do this for you, I thought I'd describe a simple way to do it using PostgreSQL/PostGIS as it illustrates one aspect of the multitude of open source tools that aid in spatial analysis. Also, the described method can be easily automated in combination with a shell script and some db insert triggers.
First, I retrieved the incident data from the resource described in an earlier post. After some awk and sed wrangling, I got the data into a format where it could be imported into a PostgreSQL table with the following structure:
philly=# d incidents Table "public.incidents" Column | Type | Modifiers -----------+--------------------------+----------- id | bigint | date | timestamp with time zone | geom | geometry | zip | integer | Indexes: "inc_gist_idx" gist (geom)Notice that there is a geometry column which specified the geocoded location of the homicide. The data came down projected using SRID 26918 - UTM Zone 18. I had to reproject the zip code geometries as they came as unprojected lat/lon. The zip code table (which I retrieved from the source listed in an earlier post) had the following structure:
philly=# d philly Table "public.philly" Column | Type | Modifiers ------------+-----------------------+----------- gid | integer | area | numeric | perimeter | numeric | zt42_d00_ | bigint | zt42_d00_i | bigint | zcta | character varying(5) | name | character varying(90) | lsad | character varying(2) | lsad_trans | character varying(50) | the_geom | geometry | Indexes: "philly_gist_idx" gist (the_geom)
Now, the zip code column of the incidents table is empty. I used the following select statement to populate the zip code column with the proper zip code which it falls in:
update incidents set zip= (select cast(name as integer) from philly where contains(transform(the_geom,26918), geom));The statement is selecting the zip code name from the zip code table where the incident point falls within the zip code polygon.