Using the US Census Geocoder
Introduction
What is Geocoding?
Hint
The act of determining a specific, canonical location based on some input data.
What we typically know about a specific location or geographical area is fuzzy. We might know part of the address, refer to the address with abbreviations, or describe only a general area. That ambiguity makes it challenging to get specific, canonical, and precise data about the geographic location, which is where the process of geocoding comes into play.
Geocoding is the process of getting a specific, precise, and canonical determination of a geographical location (a place or geographic feature) or of a geographical area (encompassing multiple places or geographic features).
A canonical determination of a geographical location or geographical area is defined by the meta-data that is returned for that location/area. Things like the canonical address, or various characteristics of the geographical area, etc. represent the “canonical” information about that location / area.
The process of geocoding returns exactly that kind of canonical / official / unambiguous meta-data about one or more geographical locations and areas based on a set of inputs. Some inputs may be expected to be imprecise or partial (e.g. addresses, typically used for forward geocoding) while others are expected to be precise but with incomplete information (e.g. longitude and latitude coordinates used in reverse geocoding).
Why the Census Geocoder?
Geocoding is used for many things, but the Census Geocoder API in particular is meant to provide the US Census Bureau’s canonical meta-data about identified locations and areas. This meta-data is then typically used when executing more in-depth analysis on data published by the US Census Bureau and other departments of the US federal and state governments.
Because the US government uses a very complicated and overlapping hierarchy of geographic areas, it is essential when working with US government data to start from the precise identification of the geographic areas and locations of interest.
But using the Census Geocoder API to get this information is non-trivial in its complexity. That’s both because the API has limited documentation on the one hand, and because its syntax is non-pythonic and requires extensive familiarity with the internals of the (complicated) datasets that the US Census Bureau manages/publishes.
The Census Geocoder library is meant to simplify all of that, by providing an easy-to-use, batteries-included, pythonic wrapper around the Census Geocoder API.
Census Geocoder vs. Alternatives
While we’re partial to the US Census Geocoder as our primary means of interacting with the Census Geocoder API, there are obviously alternatives for you to consider. Some might be better for your specific use cases, so here’s how we think about them:
The Census Geocoder API is a straightforward RESTful API. Which means that you can just execute your own HTTP requests against it, retrieve the JSON results, and work with the resulting data entirely yourself. This is what I did for years, until I got tired of repeating the same patterns over and over again, and decided to build the Census Geocoder instead.
For a super-simple use case, this is probably the most expedient way to do it. But of course, more robust use cases would require your own scaffolding with built-in retry logic, object representation, error handling, etc., which becomes non-trivial.
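A bare-bones version of that roll-your-own approach might look like the sketch below. It uses only the standard library. The base URL, the `onelineaddress` path, the query parameter names, the `Public_AR_Current` benchmark value, and the response nesting reflect the public API as I understand it, so treat them as assumptions to verify against the official documentation. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the public Census Geocoder API; verify before relying on it.
BASE_URL = "https://geocoding.geo.census.gov/geocoder"


def build_url(address, benchmark="Public_AR_Current"):
    """Build a one-line-address location request URL (hypothetical helper)."""
    query = urlencode({"address": address, "benchmark": benchmark, "format": "json"})
    return f"{BASE_URL}/locations/onelineaddress?{query}"


def fetch_matches(address, timeout=10):
    """Perform the request and return the raw list of address matches."""
    with urlopen(build_url(address), timeout=timeout) as response:
        payload = json.load(response)
    # The nesting below is assumed; there is no retry logic or error handling.
    return payload.get("result", {}).get("addressMatches", [])


# Only the URL is built here; calling fetch_matches() would hit the network.
url = build_url("4600 Silver Hill Rd, Washington, DC 20233")
```

Everything beyond this (retries, validation, turning the raw payload into objects) is the scaffolding you would have to write and maintain yourself.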
Why not use a library with batteries included?
Tip
When to use it?
In practice, I find that rolling my own solution is great when it’s an extremely simple use case, or a one-time operation (e.g. in a Jupyter Notebook) with no business logic to speak of. It’s a “quick-and-dirty” solution, where I’m trading rapid implementation (yay!) for less flexibility/functionality (boo!).
Considering how easy the Census Geocoder is to use, however, I find that I never really roll my own scaffolding when working with the Census Geocoder API.
The Census Geocode library is fantastic, and it was what I had used before building the Census Geocoder library. However, it has a number of significant limitations when compared to the US Census Geocoder:
Results are returned as-is from the Census Geocoder API. This means that:
Results are essentially JSON objects represented as dict, which makes interacting with them in Python a little more cumbersome (one has to navigate nested dict objects).
Property/field names are as in the original Census data. This means that if you do not have the documentation handy, it is hard to intuitively understand what the data represents.
The library is licensed under GPL3, which may complicate or limit its utilization in commercial or closed-source software operating under different (non-GPL) licenses.
The library requires you to remember / apply a lot of the internals of the Census Geocoder API as-is (e.g. benchmark vintages) which is complicated given the API’s limited documentation.
The library does not support custom layers, and only returns the default set of layers for any request.
The Census Geocoder explicitly addresses all of these concerns:
The library uses native Python classes to represent results, providing a more pythonic syntax for interacting with those classes.
Properties / fields have been renamed to more human-understandable names.
The Census Geocoder is made available under the more flexible MIT License.
The library streamlines the configuration of benchmarks and vintages, and provides extensive documentation.
The library supports any and all layers supported by the Census Geocoder API.
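To make the difference between nested dict navigation and native classes concrete, here is a minimal sketch. The payload is a trimmed, hand-written stand-in whose shape is assumed for illustration, and `SimpleMatch` is a hypothetical class, not one of the library's actual classes.

```python
from dataclasses import dataclass

# Hand-written stand-in for a raw API payload (shape assumed for illustration).
raw = {
    "result": {
        "addressMatches": [
            {
                "matchedAddress": "4600 SILVER HILL RD, WASHINGTON, DC, 20233",
                "coordinates": {"x": -76.92744, "y": 38.845985},
            }
        ]
    }
}


@dataclass
class SimpleMatch:
    """Hypothetical wrapper exposing readable attribute names."""
    address: str
    longitude: float
    latitude: float

    @classmethod
    def from_dict(cls, item):
        coords = item["coordinates"]
        return cls(address=item["matchedAddress"],
                   longitude=coords["x"],
                   latitude=coords["y"])


# Raw access: the caller must remember the nesting and that "x" means longitude.
raw_longitude = raw["result"]["addressMatches"][0]["coordinates"]["x"]

# Wrapped access: the same value through a descriptive attribute.
match = SimpleMatch.from_dict(raw["result"]["addressMatches"][0])
```

The wrapped form is what "more pythonic" means in practice: attribute access, readable names, and one place to change if the raw field names ever shift.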
Tip
When to use it?
Census Geocode has one advantage over the US Census Geocoder: It has a CLI.
I haven’t found much use for a CLI in the work I’ve done with the Census Geocoder API, so have not implemented it in the US Census Geocoder. Might add it in the future, if there are enough feature requests for it.
Given the above, it may be worth using Census Geocode instead of the Census Geocoder if you expect to be using a CLI.
The CensusBatchGeocoder is a fantastic library produced by the team at the Los Angeles Times Data Desk. It is specifically designed to provide a fairly pythonic interface for doing bulk geocoding operations, with great pandas serialization / de-serialization support.
However, it does have a couple of limitations:
Stale / Unmaintained? The library does not seem to have been updated since 2017, leading me to believe that it is stale and unmaintained. There are numerous open issues dating back to 2020, 2018, and 2017 that have seen no activity.
No benchmark/vintage/layer support. The library does not support the configuration of benchmarks, vintages, or layers.
Limited error handling. The library has somewhat limited error handling, judging by the issues that have been reported in the repository.
Optimized for bulk operations. The design of the library has been optimized for geocoding in bulk, which makes transactional one-off requests cumbersome to execute.
The Census Geocoder is obviously fresh / maintained, and has explicitly implemented robust error handling, and support for benchmarks, vintages, and layers. It is also designed to support bulk operations and transactional one-off requests.
Tip
When to use it?
CensusBatchGeocoder has one advantage over the US Census Geocoder: It can serialize results to a pandas DataFrame seamlessly and simply.
This is a useful feature, and one that I have added/pinned for the US Census Geocoder. If there are enough requests / up-votes on the issue, I may extend the library with this support in the future.
Given all this, it may be worth using CensusBatchGeocoder instead of the US Census Geocoder if you expect to be doing a lot of bulk operations using the default benchmark/vintage/layers.
geocoder and geopy are two of my favorite geocoding libraries in the Python ecosystem. They are both inherently pythonic, elegant, easy to use, and support most of the major geocoding providers out there with a standardized / unified API.
So at first blush, one might think: Why not just use one of these great libraries to handle requests against the Census Geocoder API?
Well, the problem is that neither geocoder nor geopy supports the Census Geocoder API as a geocoding provider. So…you can’t just use either of them if you specifically want US Census geocoding data.
Secondly, both the geocoder and geopy libraries are optimized around providing coordinates and feature information (e.g. matched address), which the Census Geocoder API results go beyond (and are not natively compatible with).
So really, if you want to interact with the Census Geocoder API, the Census Geocoder library is designed to do exactly that.
Census Geocoder Features
Easy to adopt. Just install and import the library, and you can be forward geocoding and reverse geocoding with just two lines of code.
Extensive documentation. One of the main limitations of the Geocoder API is that its documentation is scattered across the different datasets released by the Census Bureau, making it hard to navigate and understand. We’ve tried to fix that.
Location Search
Using Geographic Coordinates (reverse geocoding)
Using a One-line Address
Using a Parametrized Address
Using Batched Addresses
Geography Search
Using Geographic Coordinates (reverse geocoding)
Using a One-line Address
Using a Parametrized Address
Using Batched Addresses
Supports all available benchmarks, vintages, and layers.
Simplified syntax for indicating benchmarks, vintages, and layers.
No more hard to interpret field names. The library uses simplified (read: human understandable) names for location and geography properties.
Overview
How the Census Geocoder Works
The Census Geocoder works with the Census Geocoder API by providing a thin Python wrapper around the API’s functionality. Rather than having to construct your own HTTP requests against the API itself, you can instead work with Python objects and functions the way you normally would.
In other words, the process is very straightforward:
Install the Census Geocoder library.
Import the geocoder.
Geocode something - either locations or geographies.
Work with your geocoded locations or geographical areas.
And that’s it! Once you’ve done the steps above, you can easily geocode one-off requests or batch many requests into a single transaction.
1. Installing the Census Geocoder
To install the US Census Geocoder, just execute:
$ pip install census-geocoder
Dependencies
Validator-Collection v1.5.0 or higher
Backoff-Utils v1.0.1 or higher
Requests v2.26 or higher
2. Import the Census Geocoder
Importing the Census Geocoder is very straightforward. You can either import its components precisely (see API Reference) or simply import the entire module:
# Import the entire module.
import census_geocoder as geocoder
result = geocoder.location.from_address('4600 Silver Hill Rd, Washington, DC 20233')
result = geocoder.geography.from_address('4600 Silver Hill Rd, Washington, DC 20233')
# Import precise components.
from census_geocoder import Location, Geography
result = Location.from_address('4600 Silver Hill Rd, Washington, DC 20233')
result = Geography.from_address('4600 Silver Hill Rd, Washington, DC 20233')
3. Geocoding
Geocoding a location means to retrieve canonical meta-data about that location. Think of it as getting the “official” details for a given place. Using the Census Geocoder, you can geocode locations given:
A single-line address (whole or partial)
A parametrized address where you know its component parts
A set of longitude and latitude coordinates
A batch file in CSV or TXT format
However, the Census Geocoder API provides two different sets of meta-data for any canonical location:
Location Data. Think of it as the canonical address for a given location/place.
Geographic Area Data. Think of it as canonical information about the (different) areas that contain the given location/place.
Using the Census Geocoder library you can retrieve both types of information.
Hint
When retrieving geographic area data, you also get location data.
Getting Location Data
Retrieving data about canonical locations is very straightforward. You have four different ways to get this information, depending on what information you have about the location you want to geocode:
# Using a one-line address
import census_geocoder as geocoder
result = geocoder.location.from_address('4600 Silver Hill Rd, Washington, DC 20233')
# Using a parametrized address
import census_geocoder as geocoder
result = geocoder.location.from_address(street = '4600 Silver Hill Rd',
                                        city = 'Washington',
                                        state = 'DC',
                                        zip_code = '20233')
# Using geographic coordinates (reverse geocoding)
import census_geocoder as geocoder
result = geocoder.location.from_coordinates(longitude = -76.92744,
                                            latitude = 38.845985)
# Using batched addresses
import census_geocoder as geocoder
result = geocoder.location.from_batch(file_ = '/my-csv-file.csv')
Caution
The batch file indicated can have a maximum of 10,000 records.
Warning
While the Census Geocoder API supports CSV, TXT, XLSX, and DAT formats the Census Geocoder library only supports CSV and TXT formats so as to avoid dependency-bloat (read: Why rely on other libraries to read XLSX format data?).
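Because of the 10,000-record ceiling, a longer address list has to be split across several batch files. The sketch below shows one way to do that with the standard library. The column order (unique ID, street, city, state, ZIP code) follows the Census Bureau's batch format as I understand it, and the helper name and file-name pattern are hypothetical.

```python
import csv
import os
import tempfile

MAX_BATCH_RECORDS = 10_000  # limit stated in the caution above


def write_batches(records, directory, max_records=MAX_BATCH_RECORDS):
    """Write records to one or more CSV files, each within the record limit.

    Hypothetical helper: returns the list of file paths it created.
    """
    paths = []
    for start in range(0, len(records), max_records):
        chunk = records[start:start + max_records]
        path = os.path.join(directory, f"batch-{len(paths) + 1}.csv")
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(chunk)
        paths.append(path)
    return paths


# Assumed column order: unique ID, street, city, state, ZIP code.
records = [
    (str(i), "4600 Silver Hill Rd", "Washington", "DC", "20233")
    for i in range(1, 6)
]

with tempfile.TemporaryDirectory() as tmp:
    # A tiny limit is used here only to demonstrate the splitting.
    created = write_batches(records, tmp, max_records=2)
    file_names = [os.path.basename(p) for p in created]
```

Each resulting path could then be passed as the file_ argument of from_batch, one call per file.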
Getting Geographic Area Data
Retrieving data about the geographic areas that contain a given location/place is just as straightforward as getting location data. In fact, the syntax is almost identical. Just swap out the word 'location' for 'geography' and you’re done!
Here’s how to do it:
# Using a one-line address
import census_geocoder as geocoder
result = geocoder.geography.from_address('4600 Silver Hill Rd, Washington, DC 20233')
# Using a parametrized address
import census_geocoder as geocoder
result = geocoder.geography.from_address(street = '4600 Silver Hill Rd',
                                         city = 'Washington',
                                         state = 'DC',
                                         zip_code = '20233')
# Using geographic coordinates (reverse geocoding)
import census_geocoder as geocoder
result = geocoder.geography.from_coordinates(longitude = -76.92744,
                                             latitude = 38.845985)
# Using batched addresses
import census_geocoder as geocoder
result = geocoder.geography.from_batch(file_ = '/my-csv-file.csv')
Caution
The batch file indicated can have a maximum of 10,000 records.
Warning
While the Census Geocoder API supports CSV, TXT, XLSX, and DAT formats the Census Geocoder library only supports CSV and TXT formats so as to avoid dependency-bloat (read: Why rely on other libraries to read XLSX format data?).
Benchmarks and Vintages
The data returned by the Census Geocoder API is different from typical geocoding services, in that it is time-sensitive. A geocoding service like the Google Maps API or Here.com only cares about the current location. But the US Census Bureau’s information is inherently linked to the statistical data collected by the US Census Bureau at particular moments in time.
Thus, when making requests against the Census Geocoder API you are always asking for geographic location data or geographic area data as of a particular date. You might think “geographies don’t change”, but in actuality they are constantly evolving. Congressional districts, school districts, town lines, county lines, street names, house numbers, etc. are all constantly evolving. And to ensure that the statistical data is tied to the locations properly, that alignment needs to be maintained through two key concepts:
The benchmark is the time period when geographic information was snapshotted for use / publication in the Census Geocoder API. This is typically done twice per year, and represents the “geographic definitions as of the time period indicated by the benchmark”.
The vintage is the census or survey data that the geographies are linked to. Thus, the geographic identifiers or statistical data associated with locations or geographic areas within a given benchmark are also linked to a particular vintage of census/survey data. Trying to use those identifiers or statistical data with a different vintage of data may produce inaccurate results.
The Census Geocoder API supports a variety of benchmarks and vintages, and they are unfortunately poorly documented and difficult to interpret. Therefore, the Census Geocoder has been designed to streamline and simplify their usage.
Vintages are only available for a given benchmark. The table below provides guidance on the vintages and benchmarks supported by the Census Geocoder:
| | BENCHMARK: Current | BENCHMARK: Census2020 |
|---|---|---|
| VINTAGES | Current | Census2020 |
| | Census2020 | Census2010 |
| | ACS2019 | |
| | ACS2018 | |
| | ACS2017 | |
| | Census2010 | |
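Since a vintage is only valid for certain benchmarks, it can help to check a pairing before sending a request. The sketch below transcribes the table above into a lookup. The constant and function names are hypothetical, and the library performs its own validation, which may differ.

```python
# Benchmark -> supported vintages, transcribed from the table above.
SUPPORTED_VINTAGES = {
    "current": ["current", "census2020", "acs2019", "acs2018", "acs2017", "census2010"],
    "census2020": ["census2020", "census2010"],
}


def is_supported(benchmark, vintage):
    """Return True if the vintage is listed for the benchmark (case-insensitive)."""
    vintages = SUPPORTED_VINTAGES.get(benchmark.strip().lower(), [])
    return vintage.strip().lower() in vintages


ok = is_supported("Current", "ACS2019")       # listed under Current
bad = is_supported("Census2020", "ACS2019")   # not listed under Census2020
```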
When using the Census Geocoder, you can supply the benchmark and vintage directly when executing your geocoding request:
# Using a one-line address
import census_geocoder as geocoder
result = geocoder.location.from_address('4600 Silver Hill Rd, Washington, DC 20233',
                                        benchmark = 'Current',
                                        vintage = 'ACS2019')
result = geocoder.geography.from_address('4600 Silver Hill Rd, Washington, DC 20233',
                                         benchmark = 'Current',
                                         vintage = 'ACS2019')
# Using a parametrized address
import census_geocoder as geocoder
result = geocoder.location.from_address(street = '4600 Silver Hill Rd',
                                        city = 'Washington',
                                        state = 'DC',
                                        zip_code = '20233',
                                        benchmark = 'Current',
                                        vintage = 'ACS2019')
result = geocoder.geography.from_address(street = '4600 Silver Hill Rd',
                                         city = 'Washington',
                                         state = 'DC',
                                         zip_code = '20233',
                                         benchmark = 'Current',
                                         vintage = 'ACS2019')
# Using geographic coordinates (reverse geocoding)
import census_geocoder as geocoder
result = geocoder.location.from_coordinates(longitude = -76.92744,
                                            latitude = 38.845985,
                                            benchmark = 'Current',
                                            vintage = 'ACS2019')
result = geocoder.geography.from_coordinates(longitude = -76.92744,
                                             latitude = 38.845985,
                                             benchmark = 'Current',
                                             vintage = 'ACS2019')
# Using batched addresses
import census_geocoder as geocoder
result = geocoder.location.from_batch(file_ = '/my-csv-file.csv',
                                      benchmark = 'Current',
                                      vintage = 'ACS2019')
result = geocoder.geography.from_batch(file_ = '/my-csv-file.csv',
                                       benchmark = 'Current',
                                       vintage = 'ACS2019')
Hint
Several important things to be aware of when it comes to benchmarks and vintages in the Census Geocoder library:
Unless over-ridden by the CENSUS_GEOCODER_BENCHMARK or CENSUS_GEOCODER_VINTAGE environment variables, the benchmark and vintage default to 'Current' and 'Current' respectively.
The benchmark and vintage are case-insensitive. This means that you can supply 'Current', 'CURRENT', or 'current' and it will all work the same.
If you want to set a different default benchmark or vintage, you can do so by setting the CENSUS_GEOCODER_BENCHMARK and CENSUS_GEOCODER_VINTAGE environment variables to the defaults you want to use.
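As an illustration of the environment-variable defaults, the sketch below sets both variables for the current process and resolves them with a fallback of 'Current'. The `resolve_default` helper is hypothetical and only mimics the behavior described above. When the library itself reads these variables (at import time or per request) is not stated here, so setting them before importing the library is the safer assumption.

```python
import os


def resolve_default(env_name, fallback="Current"):
    """Hypothetical helper: read a default from the environment, else fall back."""
    return os.environ.get(env_name, fallback)


# Set defaults for this process only; normally this is done in the shell
# or in deployment configuration.
os.environ["CENSUS_GEOCODER_BENCHMARK"] = "Census2020"
os.environ["CENSUS_GEOCODER_VINTAGE"] = "Census2010"

benchmark = resolve_default("CENSUS_GEOCODER_BENCHMARK")
vintage = resolve_default("CENSUS_GEOCODER_VINTAGE")

# A variable that is not set falls back to 'Current'.
unset = resolve_default("SOME_UNSET_VARIABLE_FOR_ILLUSTRATION")
```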
Layers
When working with the Census Geocoder API (particularly when getting geographic area data), you have the ability to control which types of geographic area get returned. These types of geographic area are called “layers”.
An example of two different “layers” might be “State” and “County”. These are two different types of geographic area, one of which (County) may be encompassed by the other (State). In general, geographic areas within the same layer cannot and do not overlap. However different layers can and do overlap, where one layer (State) may contain multiple other layers (Counties), or one layer (Metropolitan Statistical Areas) may partially overlap multiple entities within a different layer (States).
When using the Census Geocoder you can easily specify the layers of data that you want returned. Unless overridden by the CENSUS_GEOCODER_LAYERS environment variable, the layers returned will always default to 'all'.
Which layers are available is ultimately determined by the vintage of the data you are retrieving. The following represents the list of layers available in each vintage:
Current
2010 Census Public Use Microdata Areas
2010 Census PUMAs
2010 PUMAs
Census Public Use Microdata Areas
Census PUMAs
PUMAs
2020 Census ZIP Code Tabulation Areas
2020 Census ZCTAs
Census ZCTAs
ZCTAs
Tribal Census Tracts
Tribal Block Groups
Census Tracts
Census Block Groups
2020 Census Blocks
Census Blocks
Blocks
Unified School Districts
Secondary School Districts
Elementary School Districts
Estates
County Subdivisions
Subbarrios
Consolidated Cities
Incorporated Places
Census Designated Places
CDPs
Alaska Native Regional Corporations
Tribal Subdivisions
Federal American Indian Reservations
Off-Reservation Trust Lands
State American Indian Reservations
Hawaiian Home Lands
Alaska Native Village Statistical Areas
Oklahoma Tribal Statistical Areas
State Designated Tribal Stastical Areas
Tribal Designated Statistical Areas
American Indian Joint-Use Areas
116th Congressional Districts
Congressional Districts
2018 State Legislative Districts - Upper
State Legislative Districts - Upper
2018 State Legislative Districts - Lower
State Legislative Districts - Lower
Census Divisions
Divisions
Census Regions
Regions
Combined New England City and Town Areas
Combined NECTAs
New England City and Town Area Divisions
NECTA Divisions
Metropolitan New England City and Town Areas
Metropolitan NECTAs
Micropolitan New England City and Town Areas
Micropolitan NECTAs
Combined Statistical Areas
CSAs
Metropolitan Divisions
Metropolitan Statistical Areas
Micropolitan Statistical Areas
States
Counties
Census2020
Urban Growth Areas
Tribal Census Tracts
Tribal Block Groups
Census Tracts
Census Block Groups
Block Groups
Census Blocks
Blocks
Unified School Districts
Secondary School Districts
Elementary School Districts
Estates
County Subdivisions
Subbarrios
Consolidated Cities
Incorporated Places
Census Designated Places
CDPs
Alaska Native Regional Corporations
Tribal Subdivisions
Federal American Indian Reservations
Off-Reservation Trust Lands
State American Indian Reservations
Hawaiian Home Lands
Alaska Native Village Statistical Areas
Oklahoma Tribal Statistical Areas
State Designated Tribal Stastical Areas
Tribal Designated Statistical Areas
American Indian Joint-Use Areas
116th Congressional Districts
Congressional Districts
2018 State Legislative Districts - Upper
State Legislative Districts - Upper
2018 State Legislative Districts - Lower
State Legislative Districts - Lower
Voting Districts
Census Divisions
Divisions
Census Regions
Regions
Combined New England City and Town Areas
Combined NECTAs
New England City and Town Area Divisions
NECTA Divisions
Metropolitan New England City and Town Areas
Metropolitan NECTAs
Micropolitan New England City and Town Areas
Micropolitan NECTAs
Combined Statistical Areas
CSAs
Metropolitan Divisions
Metropolitan Statistical Areas
Micropolitan Statistical Areas
States
Counties
Zip Code Tabulation Areas
ZCTAs
ACS2019
2010 Census Public Use Microdata Areas
2010 Census PUMAs
2010 PUMAs
Census Public Use Microdata Areas
Census PUMAs
PUMAs
2010 Census ZIP Code Tabulation Areas
2010 Census ZCTAs
Census ZCTAs
ZCTAs
Tribal Census Tracts
Tribal Block Groups
Census Tracts
Census Block Groups
Unified School Districts
Secondary School Districts
Elementary School Districts
Estates
County Subdivisions
Subbarrios
Consolidated Cities
Incorporated Places
Census Designated Places
CDPs
Alaska Native Regional Corporations
Tribal Subdivisions
Federal American Indian Reservations
Off-Reservation Trust Lands
State American Indian Reservations
Hawaiian Home Lands
Alaska Native Village Statistical Areas
Oklahoma Tribal Statistical Areas
State Designated Tribal Stastical Areas
Tribal Designated Statistical Areas
American Indian Joint-Use Areas
116th Congressional Districts
Congressional Districts
2018 State Legislative Districts - Upper
State Legislative Districts - Upper
2018 State Legislative Districts - Lower
State Legislative Districts - Lower
Census Divisions
Divisions
Census Regions
Regions
2010 Census Urbanized Areas
Census Urbanized Areas
Urbanized Areas
2010 Census Urban Clusters
Census Urban Clusters
Urban Clusters
Combined New England City and Town Areas
Combined NECTAs
New England City and Town Area Divisions
NECTA Divisions
Metropolitan New England City and Town Areas
Metropolitan NECTAs
Micropolitan New England City and Town Areas
Micropolitan NECTAs
Combined Statistical Areas
CSAs
Metropolitan Divisions
Metropolitan Statistical Areas
Micropolitan Statistical Areas
States
Counties
ACS2018
2010 Census Public Use Microdata Areas
2010 Census PUMAs
2010 PUMAs
Census Public Use Microdata Areas
Census PUMAs
PUMAs
2010 Census ZIP Code Tabulation Areas
2010 Census ZCTAs
Census ZCTAs
ZCTAs
Tribal Census Tracts
Tribal Block Groups
Census Tracts
Census Block Groups
Unified School Districts
Secondary School Districts
Elementary School Districts
Estates
County Subdivisions
Subbarrios
Consolidated Cities
Incorporated Places
Census Designated Places
CDPs
Alaska Native Regional Corporations
Tribal Subdivisions
Federal American Indian Reservations
Off-Reservation Trust Lands
State American Indian Reservations
Hawaiian Home Lands
Alaska Native Village Statistical Areas
Oklahoma Tribal Statistical Areas
State Designated Tribal Stastical Areas
Tribal Designated Statistical Areas
American Indian Joint-Use Areas
116th Congressional Districts
Congressional Districts
2018 State Legislative Districts - Upper
State Legislative Districts - Upper
2018 State Legislative Districts - Lower
State Legislative Districts - Lower
Census Divisions
Divisions
Census Regions
Regions
2010 Census Urbanized Areas
Census Urbanized Areas
Urbanized Areas
2010 Census Urban Clusters
Census Urban Clusters
Urban Clusters
Combined New England City and Town Areas
Combined NECTAs
New England City and Town Area Divisions
NECTA Divisions
Metropolitan New England City and Town Areas
Metropolitan NECTAs
Micropolitan New England City and Town Areas
Micropolitan NECTAs
Combined Statistical Areas
CSAs
Metropolitan Divisions
Metropolitan Statistical Areas
Micropolitan Statistical Areas
States
Counties
ACS2017
2010 Census Public Use Microdata Areas
2010 Census PUMAs
2010 PUMAs
Census Public Use Microdata Areas
Census PUMAs
PUMAs
2010 Census ZIP Code Tabulation Areas
2010 Census ZCTAs
Census ZCTAs
ZCTAs
Tribal Census Tracts
Tribal Block Groups
Census Tracts
Census Block Groups
Unified School Districts
Secondary School Districts
Elementary School Districts
Estates
County Subdivisions
Subbarrios
Consolidated Cities
Incorporated Places
Census Designated Places
CDPs
Alaska Native Regional Corporations
Tribal Subdivisions
Federal American Indian Reservations
Off-Reservation Trust Lands
State American Indian Reservations
Hawaiian Home Lands
Alaska Native Village Statistical Areas
Oklahoma Tribal Statistical Areas
State Designated Tribal Stastical Areas
Tribal Designated Statistical Areas
American Indian Joint-Use Areas
115th Congressional Districts
Congressional Districts
2016 State Legislative Districts - Upper
State Legislative Districts - Upper
2016 State Legislative Districts - Lower
State Legislative Districts - Lower
Census Divisions
Divisions
Census Regions
Regions
2010 Census Urbanized Areas
Census Urbanized Areas
Urbanized Areas
2010 Census Urban Clusters
Census Urban Clusters
Urban Clusters
Combined New England City and Town Areas
Combined NECTAs
New England City and Town Area Divisions
NECTA Divisions
Metropolitan New England City and Town Areas
Metropolitan NECTAs
Micropolitan New England City and Town Areas
Micropolitan NECTAs
Combined Statistical Areas
CSAs
Metropolitan Divisions
Metropolitan Statistical Areas
Micropolitan Statistical Areas
States
Counties
Census2010
Public Use Microdata Areas
PUMAs
Traffic Analysis Districts
TADs
Traffic Analysis Zones
TAZs
Urban Growth Areas
ZIP Code Tabulation Areas
Zip Code Tabulation Areas
ZCTAs
Tribal Census Tracts
Tribal Block Groups
Census Tracts
Census Block Groups
Census Blocks
Blocks
Unified School Districts
Secondary School Districts
Elementary School Districts
Estates
County Subdivisions
Subbarrios
Consolidated Cities
Incorporated Places
Census Designated Places
CDPs
Alaska Native Regional Corporations
Tribal Subdivisions
Federal American Indian Reservations
Off-Reservation Trust Lands
State American Indian Reservations
Hawaiian Home Lands
Alaska Native Village Statistical Areas
Oklahoma Tribal Statistical Areas
State Designated Tribal Stastical Areas
Tribal Designated Statistical Areas
American Indian Joint-Use Areas
113th Congressional Districts
111th Congressional Districts
2012 State Legislative Districts - Upper
2012 State Legislative Districts - Lower
2010 State Legislative Districts - Upper
2010 State Legislative Districts - Lower
Voting Districts
Census Divisions
Divisions
Census Regions
Regions
Urbanized Areas
Urban Clusters
Combined New England City and Town Areas
Combined NECTAs
New England City and Town Area Divisions
NECTA Divisions
Metropolitan New England City and Town Areas
Metropolitan NECTAs
Micropolitan New England City and Town Areas
Micropolitan NECTAs
Combined Statistical Areas
CSAs
Metropolitan Divisions
Metropolitan Statistical Areas
Micropolitan Statistical Areas
States
Counties
Note
You may notice that there are (logical) duplicate layers in the lists above, for example “2010 Census PUMAs” and “2010 Census Public Use Microdata Areas”. This is because there are multiple ways that users of Census data may refer to particular layers in their work. This duplication is purely for the convenience of Census Geocoder users, since the Census Geocoder API actually uses numerical identifiers for the layers returned.
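One way to picture this aliasing is a lookup in which several spellings point to the same layer. The sketch is illustrative only: the canonical labels are placeholders I made up, and the library's real mapping to the API's numerical identifiers is not reproduced here.

```python
# Several user-facing aliases resolve to one layer. The canonical labels are
# placeholders; the real API uses numerical identifiers that are not shown here.
LAYER_ALIASES = {
    "2010 census public use microdata areas": "pumas-2010",
    "2010 census pumas": "pumas-2010",
    "census designated places": "cdps",
    "cdps": "cdps",
}


def canonical_layer(name):
    """Resolve an alias to its placeholder canonical label, ignoring case."""
    key = name.strip().lower()
    if key not in LAYER_ALIASES:
        raise ValueError(f"Unrecognized layer: {name!r}")
    return LAYER_ALIASES[key]


long_form = canonical_layer("2010 Census Public Use Microdata Areas")
short_form = canonical_layer("2010 Census PUMAs")
```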
When geocoding data, you can simply supply the layers you want using the layers keyword argument as below:
# Using a one-line address
import census_geocoder as geocoder
result = geocoder.location.from_address('4600 Silver Hill Rd, Washington, DC 20233',
                                        benchmark = 'Current',
                                        vintage = 'ACS2019',
                                        layers = 'Census Tracts, States, CDPs, Divisions')
result = geocoder.geography.from_address('4600 Silver Hill Rd, Washington, DC 20233',
                                         benchmark = 'Current',
                                         vintage = 'ACS2019',
                                         layers = 'Census Tracts, States, CDPs, Divisions')
# Using a parametrized address
import census_geocoder as geocoder
result = geocoder.location.from_address(street = '4600 Silver Hill Rd',
                                        city = 'Washington',
                                        state = 'DC',
                                        zip_code = '20233',
                                        benchmark = 'Current',
                                        vintage = 'ACS2019',
                                        layers = 'Census Tracts, States, CDPs, Divisions')
result = geocoder.geography.from_address(street = '4600 Silver Hill Rd',
                                         city = 'Washington',
                                         state = 'DC',
                                         zip_code = '20233',
                                         benchmark = 'Current',
                                         vintage = 'ACS2019',
                                         layers = 'Census Tracts, States, CDPs, Divisions')
# Using geographic coordinates (reverse geocoding)
import census_geocoder as geocoder
result = geocoder.location.from_coordinates(longitude = -76.92744,
                                            latitude = 38.845985,
                                            benchmark = 'Current',
                                            vintage = 'ACS2019',
                                            layers = 'Census Tracts, States, CDPs, Divisions')
result = geocoder.geography.from_coordinates(longitude = -76.92744,
                                             latitude = 38.845985,
                                             benchmark = 'Current',
                                             vintage = 'ACS2019',
                                             layers = 'Census Tracts, States, CDPs, Divisions')
# Using batched addresses
import census_geocoder as geocoder
result = geocoder.location.from_batch(file_ = '/my-csv-file.csv',
                                      benchmark = 'Current',
                                      vintage = 'ACS2019')
result = geocoder.geography.from_batch(file_ = '/my-csv-file.csv',
                                       benchmark = 'Current',
                                       vintage = 'ACS2019',
                                       layers = 'Census Tracts, States, CDPs, Divisions')
Hint
When using the Census Geocoder to return geographic area data, you can request multiple layers worth of data by passing them in a comma-delimited string. This will return separate data for each layer indicated. The comma-delimited string can include white-space for easy readability, which means that the following two values are considered identical:
layers = 'Census Tracts, States, CDPs, Divisions'
layers = 'Census Tracts,States,CDPs,Divisions'
To retrieve all available layers that have data for a given location, you can submit 'all'. Unless you have set the CENSUS_GEOCODER_LAYERS environment variable to a different value, 'all' is the default set of layers that will be returned.
Note that layer names in the Census Geocoder are case-insensitive.
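To show why the spaced and compact forms are equivalent, here is a minimal sketch of how such a string could be normalized. The `parse_layers` helper is hypothetical and is not the library's own parser; it simply splits on commas, trims white-space, and lower-cases each name.

```python
def parse_layers(layers):
    """Split a comma-delimited layer string into normalized names.

    Hypothetical helper: strips surrounding white-space, lower-cases each
    name, and drops empty entries.
    """
    return [part.strip().lower() for part in layers.split(",") if part.strip()]


spaced = parse_layers("Census Tracts, States, CDPs, Divisions")
compact = parse_layers("Census Tracts,States,CDPs,Divisions")
```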
4. Working with Results
Now that you’ve geocoded some data using the Census Geocoder, you probably want to work with your data. Well, that’s pretty easy since the Census Geocoder returns native Python objects containing your location or geographical area data.
Location Data
When working with location data, there are two principal sets of meta-data made available:
Input. This is the input that was submitted to the Census Geocoder API, and it includes:
Matched Addresses. This is a collection of addresses that the Census Geocoder API returned as the canonical addresses for your inputs.
Each matched address exposes its key meta-data, including:
Geographical Area Data
Geographical area data is always returned within the context of a MatchedAddress instance, which itself is always contained within a Location instance. That matched address will have a .geographies property, which will contain a GeographyCollection. That .geographies property is what contains the detailed geographical area meta-data for all geographical areas returned in response to your API request.
Each layer requested is contained in a property of the GeographyCollection. For example, the relevant regions would be contained in the .regions property, while the relevant census tracts would be contained in the .tracts property.
See also
For a full list of the properties/layers that are available within a GeographyCollection, please see the detailed API reference.
If a layer is not requested (or is irrelevant for a given benchmark / vintage), then its corresponding property in the GeographyCollection will be None.
Within each layer/property, you will find a collection of Geography instances (technically, layer-specific sub-class instances). Each of these instances represents a geographical area returned by the Census Geocoder API, and their properties will contain the meta-data returned by that API.
Because different types of geographical area return different meta-data, there is a useful .inspect() method that will tell you what meta-data properties are available / have data.
The most universal properties (and the ones that are going to prove most useful when working with other Census Bureau datasets) are:
.geoid which contains the GEOID (unique consolidated identifier for the geographical area)
.name which contains the human-readable name of the geographical area
.geography_type which contains a human-readable label for the instance’s geographical area/layer type
.functional_status which contains a human-readable indication of the geographical area’s functional status
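The sketch below illustrates the access pattern described above: one property per layer, None for layers that were not requested, and the universal properties on each geography. It uses small stand-in classes so that it runs without the library or network access. The class names, the `summarize` helper, and all values are placeholders, and the library's real classes are documented in the API reference.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StandInGeography:
    """Stand-in for a geography instance; carries only the universal properties."""
    geoid: str
    name: str
    geography_type: str
    functional_status: str


@dataclass
class StandInCollection:
    """Stand-in for a geography collection: one attribute per layer."""
    regions: Optional[List[StandInGeography]] = None
    tracts: Optional[List[StandInGeography]] = None


def summarize(collection, layer_names):
    """Return (layer, geoid, name) tuples, skipping layers that are None."""
    rows = []
    for layer in layer_names:
        geographies = getattr(collection, layer, None)
        if geographies is None:  # layer not requested or not relevant
            continue
        rows.extend((layer, g.geoid, g.name) for g in geographies)
    return rows


# Placeholder values for illustration; not real API output.
collection = StandInCollection(
    regions=[StandInGeography("X1", "Example Region", "Region", "Example status")]
)
rows = summarize(collection, ["regions", "tracts"])
```

Checking for None before iterating matters because a layer that was not requested has no collection at all, rather than an empty one.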