Monday, November 30, 2015

Calculating Distance between Zipcodes

If you have done some searching before arriving here, you may have realized that this task is not a simple as it looks on the surface. Surely there is an function or something? No. Not exactly.

[In SAS there apparently is such a built-in function]

Here's what I used to solve this problem:

1. Functions (3)
2. Table of Zipcodes with Longitude/Latitude

I obtained the functions used from this thread: http://www.tek-tips.com/viewthread.cfm?qid=801025

To recap, I created these three Functions:

CREATE OR REPLACE function DISTANCE
    (p1lon number
    ,p1lat number
    ,p2lon number
    ,p2lat number)
    return number
is
begin
    return    get_distance
        (conv_LongLat(p1lon,'LO','SM')
        ,conv_LongLat(p1lat,'LA','SM')
        ,conv_LongLat(p2lon,'LO','SM')
        ,conv_LongLat(p2lat,'LA','SM')
        );
end;
/

CREATE OR REPLACE function Conv_LongLat
    (LongLat    in number
    ,LongOrLat    in varchar2
    ,Conversion_Units in varchar2
    )
    return number
is
    LoLa        Char(2);
    Degree_in_output_units    number;
begin
    LoLa    := upper(substr(LongOrLat,1,2));
    if    LoLa not in ('LO','LA') then
        raise_application_error(-20000,
            'Error: Longitude/Latitude indicator, "'||LongOrLat||
            '", must begin with "LO" or "LA".');
    end if;
    if    LoLa = 'LO' then     -- Longitude
        if    LongLat > 180 then
            raise_application_error(-20001,
                'Error: Longitude ('||LongLat||') must <= 180 degrees.');
        end if;
        if upper(Conversion_Units) = 'M'     -- 'Meters' per degree 
            then Degree_in_output_units := 111303;
        elsif upper(Conversion_Units) = 'F'    -- 'Feet' per degree
            then Degree_in_output_units := 365166;
        elsif upper(Conversion_Units) = 'SM'    -- 'Statute Miles' per degree
            then Degree_in_output_units := 69.16022727272727;
        elsif upper(Conversion_Units) = 'NM'    -- 'Nautical Miles' per degree
            then Degree_in_output_units := 60.098594281230234;
        else
            raise_application_error(-20002,
            'Error: Conversion-Unit indicator must be "M","F","SM",or "NM"');
        end if;
    elsif    LoLa = 'LA' then    -- Latitude
        if    LongLat > 90 then
            raise_application_error(-20003,
                'Error: Latitude ('||LongLat||') must <= 90 degrees.');
        end if;
        if upper(Conversion_Units) = 'M'    -- 'Meters' per degree
            then Degree_in_output_units := 110575;
        elsif upper(Conversion_Units) = 'F'     -- 'Feet' per degree
            then Degree_in_output_units := 362778;
        elsif upper(Conversion_Units) = 'SM'    -- 'Statute Miles' per degree
            then Degree_in_output_units := 68.70795454545454;
        elsif upper(Conversion_Units) = 'NM'    -- 'Nautical Miles' per degree
            then Degree_in_output_units := 59.70558002704562;
        else
            raise_application_error(-20002,
            'Error: Conversion-Unit indicator must be "M","F","SM",or "NM"');
        end if;
    end if;
    Return    LongLat*Degree_in_output_units;
end;
/

CREATE OR REPLACE function DISTANCE
    (p1lon number
    ,p1lat number
    ,p2lon number
    ,p2lat number)
    return number
is
begin
    return    get_distance
        (conv_LongLat(p1lon,'LO','SM')
        ,conv_LongLat(p1lat,'LA','SM')
        ,conv_LongLat(p2lon,'LO','SM')
        ,conv_LongLat(p2lat,'LA','SM')
        );
end;
/

The Table didn't seem like a big deal but after getting in a looking at the data, I had to try a few sources to get one that worked well. The problem I ran into was the same Long/Lat information being used for surrounding zipcodes as the major city.

What I finally settled on was this Zip Code Database from Softwaretools.com at: https://softwaretools.com/zip-code-database

Loading the data took me a few tries using Oracle SQL*Developer since it never gets the column sizing just right it seems. Here's is what I ended up with:













CREATE TABLE ZIPCODES_SWT
(
  ZIPCODE     VARCHAR2(5 BYTE),
  STATE       VARCHAR2(2 BYTE),
  CITY        VARCHAR2(50 BYTE),
  COUNTY      VARCHAR2(25 BYTE),
  STATEFIPS   VARCHAR2(2 BYTE),
  COUNTYFIPS  VARCHAR2(3 BYTE),
  LATITUDE    VARCHAR2(8 BYTE),
  LONGITUDE   VARCHAR2(8 BYTE),
  PREFERENCE  VARCHAR2(1 BYTE),
  TYPE        VARCHAR2(3 BYTE)
)
TABLESPACE USERS
PCTUSED    0
PCTFREE    10
INITRANS   1
MAXTRANS   255
STORAGE    (
            INITIAL          160K
            NEXT             1M
            MAXSIZE          UNLIMITED
            MINEXTENTS       1
            MAXEXTENTS       UNLIMITED
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
LOGGING 
NOCOMPRESS 
NOCACHE
MONITORING;

COMMENT ON TABLE ZIPCODES_SWT IS 'Data obtained at: https://softwaretools.com/zip-code-database on 11/30/2015.';

From here i was able to calculate someone's distance (hard-coding my Long/Lat in values 3 and 4) using:

SELECT ROUND(distance(-106.71, 35.14, -106.62, 35.11) AS DISTANCE
  FROM DUAL;


  DISTANCE
----------
6.55683725
1 row selected.

Or Join this cross reference in your query. Remember to filter these Table values: WHERE PREFERENCE = 'A'

Good luck!