US Census Geocoder¶
(Unofficial) Python Binding for the US Census Geocoder API
The US Census Geocoder is a Python library that provides Python bindings for the U.S. Census Geocoder API. It enables you to use simple Python function calls to retrieve Python object representations of geographic meta-data for the addresses or coordinates that you are searching for.
Warning
The US Census Geocoder is completely unofficial, and is in no way affiliated with the US Government or the US Census Bureau. We strongly recommend that you do business with them directly as needed, and simply provide this Python library as a facilitator for your programmatic interactions with the excellent services provided by the US Census Bureau.
Installation¶
To install the US Census Geocoder, just execute:
$ pip install census-geocoder
Dependencies¶
Validator-Collection v1.5.0 or higher
Backoff-Utils v1.0.1 or higher
Requests v2.26 or higher
Why the Census Geocoder?¶
In fulfilling its constitutional and statutory obligations, the US Census Bureau provides extensive data about the United States. They make this data available publicly through their website, through their raw data files, and through their APIs. However, while their public APIs provide great data, they are limited in both tooling and documentation. So to help with that, we’ve created the US Census Geocoder library.
The Census Geocoder library is designed to provide a Pythonic interface for interacting with the Census Bureau’s Geocoder API. It is specifically designed to eliminate the scaffolding needed to query the API directly, and provides simpler and cleaner function calls that return forward geocoding and reverse geocoding information. Furthermore, it exposes Python object representations of the outputs returned by the API, making it easy to work with the API’s data in your applications.
Key Census Geocoder Features¶
Easy to adopt. Just install and import the library, and you can be forward geocoding and reverse geocoding with just two lines of code.
Extensive documentation. One of the main limitations of the Geocoder API is that its documentation is scattered across the different datasets released by the Census Bureau, making it hard to navigate and understand. We’ve tried to fix that.
Location Search:
  Using Geographic Coordinates (reverse geocoding)
  Using a One-line Address
  Using a Parametrized Address
  Using Batched Addresses
Geography Search:
  Using Geographic Coordinates (reverse geocoding)
  Using a One-line Address
  Using a Parametrized Address
  Using Batched Addresses
Supports all available benchmarks, vintages, and layers.
Simplified syntax for indicating benchmarks, vintages, and layers.
No more hard to interpret field names. The library uses simplified (read: human understandable) names for location and geography properties.
The US Census Geocoder vs Alternatives¶
While we’re partial to the US Census Geocoder as our primary means of interacting with the Census Geocoder API, there are obviously alternatives for you to consider. Some might be better for your specific use cases, so here’s how we think about them:
The Census Geocoder API is a straightforward RESTful API, which means that you can just execute your own HTTP requests against it, retrieve the JSON results, and work with the resulting data entirely yourself. This is what I did for years, until I got tired of repeating the same patterns over and over again, and decided to build the Census Geocoder instead.
For a super-simple use case, rolling your own is probably the most expedient way to do it. But of course, more robust use cases require your own scaffolding (built-in retry logic, object representation, error handling, etc.), which becomes non-trivial.
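To give a sense of that scaffolding, here is a minimal sketch of a hand-rolled request using only the Python standard library. The endpoint URL and the `Public_AR_Current` benchmark name reflect my understanding of the public Census Geocoder API and should be checked against the Census Bureau's own documentation. The helper names (`build_url`, `fetch_json`, `geocode`) are hypothetical, and the retry policy is just one arbitrary choice.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed public endpoint for one-line address lookups (locations only).
BASE_URL = "https://geocoding.geo.census.gov/geocoder/locations/onelineaddress"


def build_url(address, benchmark="Public_AR_Current"):
    """Return the request URL for a one-line address lookup."""
    query = urllib.parse.urlencode(
        {"address": address, "benchmark": benchmark, "format": "json"}
    )
    return f"{BASE_URL}?{query}"


def fetch_json(url, timeout=10):
    """Perform the HTTP GET and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def geocode(address, fetch=fetch_json, retries=3, delay=1.0):
    """Look up an address, retrying with exponential backoff on network errors."""
    url = build_url(address)
    for attempt in range(retries):
        try:
            return fetch(url)
        except OSError:  # urllib.error.URLError is a subclass of OSError
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(delay * 2 ** attempt)
```

Even this small version says nothing about validating inputs, interpreting the response, or handling API-level errors, which is where most of the real effort goes.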
Why not use a library with batteries included?
Tip
When to use it?
In practice, I find that rolling my own solution is great when it’s an extremely simple use case, or a one-time operation (e.g. in a Jupyter Notebook) with no business logic to speak of. It’s a “quick-and-dirty” solution, where I’m trading rapid implementation (yay!) for less flexibility/functionality (boo!).
Considering how easy the Census Geocoder is to use, however, I find that I never really roll my own scaffolding when working with the Census Geocoder API.
The Census Geocode library is fantastic, and it was what I had used before building the Census Geocoder library. However, it has a number of significant limitations when compared to the US Census Geocoder:
Results are returned as-is from the Census Geocoder API. This means that:
Results are essentially JSON objects represented as dict, which makes interacting with them in Python a little more cumbersome (one has to navigate nested dict objects).
Property/field names are as in the original Census data. This means that if you do not have the documentation handy, it is hard to intuitively understand what the data represents.
The library is licensed under GPL3, which may complicate or limit its utilization in commercial or closed-source software operating under different (non-GPL) licenses.
The library requires you to remember / apply a lot of the internals of the Census Geocoder API as-is (e.g. benchmark vintages) which is complicated given the API’s limited documentation.
The library does not support custom layers, and only returns the default set of layers for any request.
The Census Geocoder explicitly addresses all of these concerns:
The library uses native Python classes to represent results, providing a more pythonic syntax for interacting with those classes.
Properties / fields have been renamed to more human-understandable names.
The Census Geocoder is made available under the more flexible MIT License.
The library streamlines the configuration of benchmarks and vintages, and provides extensive documentation.
The library supports any and all layers supported by the Census Geocoder API.
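To make the dict-versus-object point concrete, here is a small sketch. The `raw` dictionary is a hand-written sample shaped the way I understand the API's JSON response to look (the `addressMatches`, `matchedAddress`, and `coordinates` keys are assumptions to verify against a real response), and `MatchedAddress` is a hypothetical stand-in class, not the library's actual implementation.

```python
from dataclasses import dataclass

# Hand-written sample shaped like a raw API response (illustrative values only).
raw = {
    "result": {
        "addressMatches": [
            {
                "matchedAddress": "4600 SILVER HILL RD, WASHINGTON, DC 20233",
                "coordinates": {"x": -76.92744, "y": 38.845985},
            }
        ]
    }
}

# Working with the raw result means navigating nested dicts and knowing
# that "x" is the longitude and "y" is the latitude.
match = raw["result"]["addressMatches"][0]
print(match["matchedAddress"], match["coordinates"]["y"])


@dataclass
class MatchedAddress:  # hypothetical class for illustration
    address: str
    longitude: float
    latitude: float

    @classmethod
    def from_dict(cls, data):
        """Build an object with readable attribute names from one raw match."""
        coords = data["coordinates"]
        return cls(
            address=data["matchedAddress"],
            longitude=coords["x"],
            latitude=coords["y"],
        )


matched = [MatchedAddress.from_dict(m) for m in raw["result"]["addressMatches"]]
print(matched[0].address, matched[0].latitude)
```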
Tip
When to use it?
Census Geocode has one advantage over the US Census Geocoder: It has a CLI.
I haven’t found much use for a CLI in the work I’ve done with the Census Geocoder API, so I have not implemented one in the US Census Geocoder. I might add it in the future, if there are enough feature requests for it.
Given the above, it may be worth using Census Geocode instead of the Census Geocoder if you expect to be using a CLI.
The CensusBatchGeocoder is a fantastic library produced by the team at the Los Angeles Times Data Desk. It is specifically designed to provide a fairly pythonic interface for doing bulk geocoding operations, with great pandas serialization / de-serialization support.
However, it does have a couple of limitations:
Stale / Unmaintained? The library does not seem to have been updated since 2017, leading me to believe that it is stale and unmaintained. There are numerous open issues dating back to 2020, 2018, and 2017 that have seen no activity.
No benchmark/vintage/layer support. The library does not support the configuration of benchmarks, vintages, or layers.
Limited error handling. The library has somewhat limited error handling, judging by the issues that have been reported in the repository.
Optimized for bulk operations. The design of the library has been optimized for geocoding in bulk, which makes transactional one-off requests cumbersome to execute.
The Census Geocoder is obviously fresh / maintained, and has explicitly implemented robust error handling, and support for benchmarks, vintages, and layers. It is also designed to support bulk operations and transactional one-off requests.
Tip
When to use it?
CensusBatchGeocoder has one advantage over the US Census Geocoder: It can serialize results to a pandas DataFrame seamlessly and simply.
This is a useful feature, and one that I have added as a pinned issue for the US Census Geocoder. If there are enough requests / up-votes on the issue, I may extend the library with this support in the future.
Given all this, it may be worth using CensusBatchGeocoder instead of the US Census Geocoder if you expect to be doing a lot of bulk operations using the default benchmark/vintage/layers.
geocoder and geopy are two of my favorite geocoding libraries in the Python ecosystem. They are both inherently pythonic, elegant, easy to use, and support most of the major geocoding providers out there with a standardized / unified API.
So at first blush, one might think: Why not just use one of these great libraries to handle requests against the Census Geocoder API?
Well, the problem is that neither geocoder nor geopy supports the Census Geocoder API as a geocoding provider. So…you can’t just use either of them if you specifically want US Census geocoding data.
Secondly, both the geocoder and geopy libraries are optimized around providing coordinates and feature information (e.g. matched address), which the Census Geocoder API results go beyond (and are not natively compatible with).
So really, if you want to interact with the Census Geocoder API, the Census Geocoder library is designed to do exactly that.
Hello World and Basic Usage¶
1. Import the Census Geocoder¶
import census_geocoder as geocoder
2. Execute a Coding Request¶
Using a One-line Address¶
location = geocoder.location.from_address('4600 Silver Hill Rd, Washington, DC 20233')
geography = geocoder.geography.from_address('4600 Silver Hill Rd, Washington, DC 20233')
Using a Parametrized Address¶
location = geocoder.location.from_address(
    street_1='4600 Silver Hill Rd',
    city='Washington',
    state='DC',
    zip_code='20233'
)
geography = geocoder.geography.from_address(
    street_1='4600 Silver Hill Rd',
    city='Washington',
    state='DC',
    zip_code='20233'
)
Using Batched Addresses¶
# Via a CSV File
location = geocoder.location.from_batch('my-batched-address-file.csv')
geography = geocoder.geography.from_batch('my-batched-address-file.csv')
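If you need to produce such a file, the sketch below builds one with the standard `csv` module. It assumes the batch file follows the Census Bureau's batch layout as I understand it (no header row; columns are unique ID, street address, city, state, ZIP code); check the library's documentation for the exact format it expects. The `batch_csv_text` helper and the second sample address are hypothetical.

```python
import csv
import io


def batch_csv_text(records):
    """Render (id, street, city, state, zip) tuples as header-less CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for unique_id, street, city, state, zip_code in records:
        writer.writerow([unique_id, street, city, state, zip_code])
    return buffer.getvalue()


records = [
    ("1", "4600 Silver Hill Rd", "Washington", "DC", "20233"),
    ("2", "1600 Pennsylvania Ave NW", "Washington", "DC", "20500"),
]

csv_text = batch_csv_text(records)
print(csv_text)
```

Saving `csv_text` to a file such as `my-batched-address-file.csv` would give you something to pass to `from_batch`.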
Using Coordinates¶
location = geocoder.location.from_coordinates(
    latitude=38.845985,
    longitude=-76.92744
)
geography = geocoder.geography.from_coordinates(
    latitude=38.845985,
    longitude=-76.92744
)
3. Work with the Results¶
Work with Python Objects¶
location.matched_addresses[0].address
>> 4600 SILVER HILL RD, WASHINGTON, DC 20233
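Indexing `matched_addresses[0]` directly assumes at least one match came back. A defensive sketch is shown below; it relies only on the `matched_addresses` and `address` attributes used above, and it assumes that an unmatched lookup yields an empty list (the library may instead raise an exception, so treat this as illustrative). The `first_matched_address` helper is hypothetical, and `SimpleNamespace` objects stand in for real results so the snippet runs without network access.

```python
from types import SimpleNamespace


def first_matched_address(location):
    """Return the first matched address string, or None if there is no match."""
    matches = getattr(location, "matched_addresses", None) or []
    return matches[0].address if matches else None


# Stand-in objects mimicking a result with one match and a result with none.
found = SimpleNamespace(
    matched_addresses=[
        SimpleNamespace(address="4600 SILVER HILL RD, WASHINGTON, DC 20233")
    ]
)
empty = SimpleNamespace(matched_addresses=[])

print(first_matched_address(found))
print(first_matched_address(empty))
```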
Questions and Issues¶
You can ask questions and report issues on the project’s GitHub Issues page.
Contributing¶
We welcome contributions and pull requests! For more information, please see the Contributor Guide. And thanks to all those who’ve already contributed:
Chris Modzelewski (@insightindustry)
Testing¶
We use TravisCI for our build automation and ReadTheDocs for our documentation.
Detailed information about our test suite and how to run tests locally can be found in our Testing Reference.
License¶
The Census Geocoder is made available under an MIT License.