Configuration Based Universal International Address Model Approach
In the first post in the series we discussed some goals for the a Universal International Address Model. These goals were:
- Implement a solution that works for any country
- Implement a standards based solution
- Model must be normalized
- Model should include lookup tables
- Model should be able to be governed and addresses verifiable by standard
Bonus Requirements:
- Model should support use within an application
- Model should be able to have country specific views generated for use in analytics
- Model should be able to handle 1-N levels of Recipient information
Implement a solution that works for any country
Blasphemy you shout as you throw your pen and notepad down on the table looking over this post. Well… I would have been right there with you probably 2 weeks ago. Let’s review what changed.
Configuration management is a big thing now, and when we look to it almost all solutions are based in some form or another off of yaml. So the question becomes why don’t we try to implement this approach for data, well that’s what we did. Here is the model:
The address format configuration is where we will store a JSON for each address format accepted by a country referenced by the ISO 3166-1 Standard. We could then use a database such as PostgreSQL or Snowflake to parse the JSON and use the information to return the fields from the table that are relevant to the specific address format we want, or we can build a view for each format using the JSON to drive the field selection.
Implement a standards based solution
In this model the ISO Country standards were employed as well as sourcing the wikipedia site on address formats. Sources include:
- ISO 3166-1 – Used for Country Code
- ISO 3166-2 – Used for Country Subdivision
- Wikipedia Address – Used to determine the individual attributes needed per Country and Country Address Standard
Model must be normalized
We can see in the attached model that we have normalized to the nth degree. As we move from conceptual to logical, the individual components required of each entity will be vetted and additional normalization will be employed as needed.
Model should include lookup tables
RDM is taken to a new level in this model which is initially (and correctly so for smaller implementations) viewed as overkill. For example storing street number as an address and having a separate table to store all the numbers seems unnecessary, but when you think it though is really cool as every field in every entity becomes reusable for any other standard using the same field.
Model should be able to be governed and addresses verifiable by standard
By having all components stored individually and a configuration file that defines what is included for a specific address, using the two in tandem we can pull addresses of any type for any country. Since we would be down to a specific address type, we can then use the controlling standard for the configuration to verify that all records are compliant and their quality.
Model should support use within an application
Most modern applications in some way shape or form deal with JSONs, especially with REST APIs, Java backends, and Full Stack JS applications, JSONs permeate almost every application to some degree. In fact this is such a popular option that development teams use databases like PostgreSQL (which natively supports storage of JSONs in the jsonb field type) to store configurations within an application. This JSON could be used when the country is selected to grab the specific configuration applicable to the scenario from the country and only populate the fields on the screen pertinent to the users location. This is a huge win as not only is it super flexible, but makes any developers you work with interested in the stuff those “data” guys are working on.
Model should be able to have country specific views generated for use in analytics
Yep. We can use the configuration file to generate the a country address format specific view. This will be an upcoming post where we weave some SQL wizardry and dynamic SQL to build something amazing
Model should be able to handle 1-N levels of Recipient information
This is in there. However the complexity surrounding this component is somewhat high, and I’m not 100% confident I didn’t mess something up here in the first pass. I will review and put out an overview in a future post of how this is enabled
Conclusions
This is a fun model. I know that this post was almost all text, but getting the model out is going to let us dive into a ton of topics way deeper, and this specific implementation will lead to some really “fun” (this is a serious use of the word fun, if it wasn’t I wouldn’t be doing these post in my spare time) examples in cool things that can be done in databases utilizing configurations. In the next post we’re going to get physical and attempt to test out the feasibility of our solution utilizing the configuration based approach for a US and non US address. Feel free to reach out with any comments or questions. Until next week.