Author here!
Funny to see this on the front page xD
That was the blog post for the initial release, and a lot of things have changed since then (definitely deserves a new blog post ^^).
The first big change happened six months after the release, when I rewrote most of the geometrical algorithms (leveraging the excellent geo crate) and got a massive boost in speed and reduction in memory usage which made it applicable at high resolutions and country-scale levels (e.g. some computation went from 15h to 7min, and from 18GB of RAM to 100MB). I also added support for alternative coverage methods (back then H3 only offered centroid containment).
Since then, the reference implementation has caught up in terms of coverage predicates and even provides a new experimental coverage algorithm addressing some performance issues. I haven’t implemented it yet but, IIRC, my current implementation still outperforms theirs (though less dramatically so).
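To make the difference between coverage predicates concrete, here is a minimal planar sketch. It is not h3o's or H3's actual code: it assumes a flat hexagon and a circular region, and the function names and the mode names `centroid` and `full` are made up for illustration.

```python
import math


def hexagon_vertices(centre, size):
    """Vertices of a pointy-top regular hexagon; size is centre-to-vertex."""
    cx, cy = centre
    return [
        (cx + size * math.cos(math.radians(60 * i + 30)),
         cy + size * math.sin(math.radians(60 * i + 30)))
        for i in range(6)
    ]


def in_disk(point, disk_centre, radius):
    """True when the point lies inside or on the circular region."""
    return math.dist(point, disk_centre) <= radius


def cell_selected(cell_centre, size, disk_centre, radius, mode):
    """Decide whether a cell belongs to the coverage of a disk.

    mode (hypothetical names):
      "centroid": keep the cell when its centre is inside the region.
      "full":     keep the cell only when it is entirely inside the region.
    """
    if mode == "centroid":
        return in_disk(cell_centre, disk_centre, radius)
    if mode == "full":
        # A disk is convex, so the hexagon lies inside it exactly when
        # all six of its vertices do.
        return all(
            in_disk(v, disk_centre, radius)
            for v in hexagon_vertices(cell_centre, size)
        )
    raise ValueError(f"unknown mode: {mode}")
```

A cell whose centre sits just inside the region's edge is kept under `centroid` but dropped under `full`, which is why the choice of predicate changes which cells end up in a coverage.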
I’ve also developed a little ecosystem of libraries around h3o:
- Tailored compression algorithm with h3o-zip (in optimal cases I’ve observed reductions from ~2GB to 100KB)
- Compact data structure for fast lookup with h3o-ice (based on FST)
- Map rendering with h3o-mvt
Most of these things run in production at Amo, where one of the main use cases is powering the Scratchmap feature, both client and server side, in the Bump app. I’ve also seen adoption from other projects (bindings for R, Erlang, Polars, ...) and enterprises :)
Very impressive results, cool to see innovation in this space! I’d definitely be interested in a follow up post going into the details of the geometric algorithms.
I’m working on my own DGGS, A5, the first (and only) to use pentagons. It offers true equal area cells and a much higher cell fidelity (below 1cm compared to 1m for H3).
I’m looking for contributors to get involved and you seem to have the perfect skill set. It would be amazing to have you join the project :) https://a5geo.org/ https://github.com/felixpalmer/a5
Ha, yeah, I remember reading about your project back in April (I think someone shared it on the GeoRust Discord). Really cool stuff you have here!
Can't say I understand all the math behind it, as it's not my forte (even for H3, for the more numerical parts, I rely on the work of the original authors: I could never have come up with this myself), but your doc is really great!
For the follow-up article, I hope I can get to it eventually. But spare time is a rare currency ^^
h3o-zip is really impressive! I've been wanting to play around with it more, and I've been meaning to ask you if you have any good references for that encoding approach. I understand how it works in h3o-zip, but I'd be interested to know more about where else that approach has been used.
H3o is an awesome piece of work. I created polars bindings for it (another reason I love polars) and last time I benchmarked it, it had 5x better performance than even duckdb’s C++ implementation.
https://github.com/Filimoa/polars-h3
I really want to do a DuckDB extension someday, I think it would be pretty cool. I had looked into it 2 years ago but didn't dig further.
Now that H3 provides one, maybe I should take another look at it.
TIL! What are the advantages of hexagonal spatial indexing compared to e.g. quad trees, r-trees?
The main advantages of hexagons are that the distance to each neighbour is always the same, and the distortion across the globe is much less, because of the way H3 creates its grid (compared to the earlier Google S2 which uses squares and distorts a lot). There’s an excellent Uber blog post about this, I’ll see if I can find the link.
(here’s the blog post: https://www.uber.com/en-GB/blog/h3/ )
One of the big ones that hasn't been mentioned is all of a hexagon's neighbors are equidistant. As a result, h3 is a better fit for flow modeling - stuff like telematics. This has some nice properties for ML too.
You can see one of my jupyter notebooks that dives deep into this with h3 here: https://drive.google.com/file/d/18jIVEbE_1QbwTbHdMqj0AVqguf2...
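The equidistant-neighbour property is easy to check on an idealised flat grid. The sketch below is a planar illustration only (on the sphere, H3 cells are only approximately regular); the function names are made up, and the hexagons are assumed to be pointy-top with unit spacing between centres.

```python
import math

# Axial (q, r) offsets of the six neighbours of a hexagonal cell.
HEX_NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

# Offsets of the eight neighbours of a square cell (edges and corners).
SQUARE_NEIGHBOURS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_centre(q, r):
    """Planar centre of a pointy-top hexagon with unit centre spacing."""
    return (q + r / 2, r * math.sqrt(3) / 2)


def hex_neighbour_distances():
    """Distinct centre-to-centre distances from a hexagon to its neighbours."""
    origin = hex_centre(0, 0)
    return sorted(
        {round(math.dist(origin, hex_centre(q, r)), 9) for q, r in HEX_NEIGHBOURS}
    )


def square_neighbour_distances():
    """Distinct centre-to-centre distances from a square to its neighbours."""
    return sorted({round(math.hypot(dx, dy), 9) for dx, dy in SQUARE_NEIGHBOURS})
```

The hexagon case collapses to a single distance, while the square case has two: 1 for edge neighbours and about 1.414 for corner neighbours.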
The main advantage of hexagonal spherical tiling systems is that they are roughly equal area at a given resolution. This makes them particularly suitable for generating visualizable aggregates when you primarily care about spatial distribution rather than specific boundaries (like borders).
The main disadvantage of non-congruent tiling systems like H3 is poor scalability and performance when running analytical computations. In most cases you wouldn't want to shard your underlying data this way even if this is how you want to visualize it.
It is easy to get the best of both worlds. You can shard data models as 3-space spherical embeddings (efficient for large-scale analytic computation) and convert query results to an H3 tiling at wire speed on demand.
It is a common misconception that h3 is equal area. At any resolution level the cell size varies by a factor of 2, which is (roughly) the same as S2.
See the following visualizations for an illustration:
https://a5geo.org/examples/area
https://a5geo.org/examples/airbnb
This is cool. Not personally going to switch because I want to stay with the official implementation, but I appreciate the effort involved in porting libraries.
I wonder why HEALPix never gained a footing outside of cosmology. I suppose nobody likes spherical harmonics as much as physicists.
who is using uber h3 and what for? (besides uber of course)
As one example, the U.S. Federal Communications Commission uses it in its Broadband Data Collection program. You can see some of how it's been implemented here: https://broadbandmap.fcc.gov/
Edit: It seems some people get a blocked message when visiting the base url. The home path may work better? https://broadbandmap.fcc.gov/home
That's really neat. Here's a screenshot for people who can't access that URL: https://gist.github.com/simonw/eb31ec34af16a1e19ee0d7ca90e8a...
I cannot... Geoblocking European IPs?
I'm blocked and I'm in the US
Maybe the home path will work better? https://broadbandmap.fcc.gov/home
H3 was integrated into ClickHouse in 2019, and since then, I have heard many interesting stories. There are unusual ones, e.g., when it is used not to map data on Earth, but for astronomy (stars, galaxies).
I'm curious about those interesting stories ^^ Care to share a bit more?
Overture maps docs use it to visualize the coverage of Overture address data.
https://docs.overturemaps.org/guides/addresses/
Picture url: https://docs.overturemaps.org/assets/images/address-coverage...
H3 is commonly used for creating visualization aggregates e.g. creating visual summaries of data distribution. That was its primary design case.
amo is using it quite a lot, mainly for the scratch map feature in the Bump Map application, but not only.
Use cases are:
- data storage
- data aggregation/clustering
- spatial indexing
- geometrical computation (as long as you're OK with approximation, you can speed up a lot of things by working with CellID instead of actual geometries)
- data visualization
I've seen it used by Databend, Helium, Breakroom (they did an Erlang binding on top of h3o), beaconDB, Greptime, Meilisearch. But I don't exactly know what they are using it for (just that they pulled h3o in their projects).
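As a rough illustration of the aggregation use case, the sketch below bins points by cell and counts them per cell ID. It deliberately uses a flat hexagonal grid with axial coordinates and cube rounding as a stand-in, not H3 or h3o; with a real library the `point_to_hex` step would be replaced by a lat/lng-to-cell lookup. All names here are hypothetical.

```python
import math
from collections import Counter


def cube_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error so that
    # q + r + s == 0 still holds.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return (rq, rr)


def point_to_hex(x, y, size):
    """Axial (q, r) of the pointy-top hexagon containing the point.

    size is the centre-to-vertex distance of a cell.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return cube_round(q, r)


def count_by_cell(points, size):
    """Aggregate points into per-cell counts keyed by cell ID."""
    return Counter(point_to_hex(x, y, size) for x, y in points)
```

Once points are reduced to cell IDs, clustering and joins become operations on hashable keys rather than on geometries, at the cost of the approximation mentioned above.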
We use it at Neighbor.com for a lot of data analysis in our marketplace, things like our price recommendations, supply and demand balances, etc.
I’ve used h3 for a game. Since every location maps to a unique hex, I can ensure that each cell of the grid sits at the same place in the world for everyone, so players can then compete over it.
This is pretty big for WASM projects
I never understood why anyone would prefer the H3 hex tiles over Google’s much simpler S2 system: http://s2geometry.io/
Sure, hex tiles make certain circular nearest neighbour searches slightly more accurate… but still have an error.
And then… everything else that’s inconvenient with hex tiling, like that issue that subdivisions of a cell leak into the neighbouring cells and hence don’t add up to 100% of the parent! This makes many database queries return lies, or the queries need very complex and slow(!) code to compensate.
H3 is preferred for geo analytics because it produces a more uniform spatial index with low distortion and consistent distances between cells.
Its primary use case was efficient spatial aggregation for applications like pricing, demand forecasting, positioning, etc.
Some of the original developers of H3 gave a presentation about it that goes over the tradeoffs between those different systems, would recommend watching it.
https://www.youtube.com/watch?v=wDuKeUkNLkQ
They have a page about pros and cons: https://h3geo.org/docs/comparisons/s2/
For my use case, the visual distortion of S2 was quite a no-go.
As for DB queries, it really depends on your use case and how you store your data, but you can get some good results. But yeah, if you really need exact parent-child containment, S2 is easier to work with.
It also uses pentagons in some places because a hexagonal grid can’t tile a sphere. They made sure that the pentagons are located in water, but this feels like it will add even more edge cases to handle.
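For a sense of scale on the pentagon point: H3 has exactly 12 pentagons at every resolution, and the total number of cells at resolution `res` (0 to 15) is 2 + 120 * 7^res. A small sketch of that arithmetic, with made-up function names:

```python
PENTAGONS_PER_RESOLUTION = 12  # one per icosahedron vertex, at every resolution
MAX_RESOLUTION = 15


def h3_cell_count(resolution):
    """Total number of H3 cells (hexagons and pentagons) at a resolution."""
    if not 0 <= resolution <= MAX_RESOLUTION:
        raise ValueError("H3 resolutions range from 0 to 15")
    return 2 + 120 * 7 ** resolution


def h3_hexagon_count(resolution):
    """Number of hexagonal cells at a resolution."""
    return h3_cell_count(resolution) - PENTAGONS_PER_RESOLUTION


def pentagon_fraction(resolution):
    """Share of cells at a resolution that are pentagons."""
    return PENTAGONS_PER_RESOLUTION / h3_cell_count(resolution)
```

At resolution 0 that is 122 cells, 12 of them pentagons; the pentagon count stays fixed while the total grows by roughly 7x per level, so they become a vanishing share of the grid, though code that walks neighbours still has to handle them.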