Tutorial: Using SQL Extensions in Vega to Improve Rendering
Marks defined in Vega specify how to render data-backed geometric primitives for a visualization. Because these are visual primitives, the default units for defining position and size are in pixels. Pixel units usually are not directly representable by the data space, so the driving data must be mapped to pixel space to be used effectively. In many cases, this data space-to-pixel space mapping can be handled with scales. However, in a number of instances, particularly in geo-related cases, you want to size the primitives in world space units, such as meters. These units cannot be easily converted to pixel units using Vega scales.
This tutorial describes how to use available SQL extension functions in Vega to map meters to pixels, thereby improving map rendering.
Let's look at a basic example. The following uses a public polical contributions dataset, and draws circles for the points positioned using the GPS location of the contributor. The circles are colored by the recipient's political party affiliation and sized to be 10 pixels in diameter:
{ "width": 1146, "height": 1116, "data": [ { "name": "pointmap", "sql": "SELECT lon, lat, recipient_party, rowid FROM fec_contributions_oct WHERE (lon BETWEEN -119.49268182426508 AND -76.518508633361) AND (lat BETWEEN 21.99999999999997 AND 53.999999999999716) LIMIT 2000000" } ], "scales": [ { "name": "pointmap_fillColor", "type": "ordinal", "domain": ["D","R","I"], "range": ["deepskyblue","crimson","gold"], "default": "peru", "nullValue": "gainsboro" } ], "projections": [ { "name": "merc", "type": "mercator", "bounds": { "x": [-119.49268182426508,-76.518508633361], "y": [21.99999999999997,53.999999999999716] } } ], "marks": [ { "type": "symbol", "from": {"data": "pointmap"}, "properties": { "xc": {"field": "lon"}, "yc": {"field": "lat"}, "fillColor": {"scale": "pointmap_fillColor","field": "recipient_party"}, "shape": "circle", "width": 10, "height": 10 }, "transform": {"projection": "merc"} } ] }
The resulting render, composited over a basemap courtesy of Mapbox, looks like this:
Because the circles are sized using pixels, if you zoom in, the circles stay sized at a fixed 10 pixels. The size of the dots does not stay relative to the area of the map originally covered:
... "projections": [ { "name": "merc", "type": "mercator", "bounds": { "x": [-112.67762110616854,-112.15822455336946], "y": [40.30629722096336,40.69091660556256] } } ], ...
The resulting render in this case looks like this:
To keep the size of the points relative to an area on the map, you need to define the size of the pixels in meters. Currently, Vega does not provide a scale that maps meters in a mercator-projected space to pixel units. To bypass this limitation, you can use an OmniSci extension function that performs meters-to-pixels conversion using a mercator-projected space.
For scalar columns, such as lon/lat, use the following:
For geo POINT columns, you use:
Note | Because the extension functions can only return scalar values, each dimension (width and height) must have its own extension function. |
To apply these functions to the previous example, add these extension functions to your SQL code, and use the results of the extension functions to determine the width and height of the circles. The following example sizes the points to 1 km in diameter:
javascript { "width": 1146, "height": 1116, "data": [ { "name": "pointmap", "sql": "SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -119.49268182426508, -76.518508633361, 1146, 1) as width convert_meters_to_merc_pixel_height(1000, lon, lat, 21.99999999999997, 53.999999999999716, 1116, 1) as height, recipient_party as color, rowid FROM fec_contributions_oct WHERE (lon BETWEEN -119.49268182426508 AND -76.518508633361) AND (lat BETWEEN 21.99999999999997 AND 53.999999999999716) LIMIT 2000000" } ], "scales": [ { "name": "pointmap_fillColor", "type": "ordinal", "domain": ["D","R","I"], "range": ["deepskyblue","crimson","gold"], "default": "peru", "nullValue": "gainsboro" } ], "projections": [ { "name": "merc", "type": "mercator", "bounds": { "x": [-119.49268182426508,-76.518508633361], "y": [21.99999999999997,53.999999999999716] } } ], "marks": [ { "type": "symbol", "from": {"data": "pointmap"}, "properties": { "xc": {"field": "lon"}, "yc": {"field": "lat"}, "fillColor": {"scale": "pointmap_fillColor","field": "recipient_party"}, "shape": "circle", "width": {"field": "width"}, "height": {"field": "height"} }, "transform": {"projection": "merc"} } ] }
Note the differences in this Vega code compared to the earlier example; two projections were added to the SQL code:
convert_meters_to_merc_pixel_width(1000, lon, lat, -119.49268182426508, -76.518508633361, 1146, 1)
as widthconvert_meters_to_merc_pixel_height(1000, lon, lat, 21.99999999999997, 53.999999999999716, 1116, 1)
as height
This converts 1 km to a pixel value in width/height based on the current view of a mercator-projected map.
The width/height calculated here is now used to drive the width/height of the circle using this JSON in the Vega mark
:
"width": {"field": "width"}, "height": {"field": "height"}
The resulting render looks like this:
Now, if you zoom in, the size of the points stays relative to the map:
... "projections": [ { "name": "merc", "type": "mercator", "bounds": { "x": [-112.67762110616854,-112.15822455336946], "y": [40.30629722096336,40.69091660556256] } } ], ...
...with the following resulting render:
The following code zooms in a bit more:
... "projections": [ { "name": "merc", "type": "mercator", "bounds": { "x": [-112.52569969159018,-112.37518840098163], "y": [40.527435942877986,40.63875135019538] } } ], ...
and results in the following render:
Notice that the WHERE
clause of the SQL filters out points not in view:
... WHERE (lon BETWEEN -119.49268182426508 AND -76.518508633361) AND (lat BETWEEN 21.99999999999997 AND 53.999999999999716) ...
However, when zoomed in far enough, a point can disappear, even though its associated circle is still in view. This occurs because only the center of the circle is checked in this filter and not the whole rendered circle.
To illustrate this, consider a render of the following query:
SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.49286564041051, -112.34235434980197, 1146, 1) as width, convert_meters_to_merc_pixel_height(1000, lon, lat, 40.53172840847458, 40.64303667787769, 1116, 1) as height, recipient_party, rowid FROM fec_contributions_oct WHERE (lon BETWEEN -112.49286564041051 AND -112.34235434980197) AND (lat BETWEEN 40.53172840847458 AND 40.64303667787769) LIMIT 2000000
The resulting image looks like this:
If you pan to the left, the blue dot disappears, although it should still be visible. Here is the query:
SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.48984490770093, -112.33933361709238, 1146, 1) as width, convert_meters_to_merc_pixel_height(1000, lon, lat, 40.5315287650088, 40.642837366411584, 1116, 1) as height, recipient_party, rowid FROM fec_contributions_oct WHERE (lon BETWEEN -112.48984490770093 AND -112.33933361709238) AND (lat BETWEEN 40.5315287650088 AND 40.642837366411584) LIMIT 2000000
...and the resulting image:
To alleviate this issue, you can use the extension functions as a filter:
- For scalar columns (such as lon/lat):
is_point_size_in_merc_view
- For geo POINT columns:
is_point_size_in_view
These extension functions take as arguments the parameters of the view along with the point size in meters, and return true
if the point is in the defined view, or false
otherwise.
Refering back to the original example, replace the WHERE
clause with its is_point_size_in_merc_view
equivalent:
SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.49286956397471, -112.34028759586535, 1146, 1) as width, convert_meters_to_merc_pixel_height(1000, lon, lat, 40.531763370983555, 40.64460162657784, 1116, 1) as height, recipient_party, rowid FROM fec_contributions_oct WHERE is_point_size_in_merc_view(lon, lat, 1000, -112.49286956397471, -112.34028759586535, 40.531763370983555, 40.64460162657784) LIMIT 2000000
This results in:
Now, pan slightly to the left again:
SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.48980727316768, -112.33722530505833, 1146, 1) as width, convert_meters_to_merc_pixel_height(1000, lon, lat, 40.531763370983555, 40.64460162657784, 1116, 1) as height, recipient_party, rowid FROM fec_contributions_oct WHERE is_point_size_in_merc_view(lon, lat, 1000, -112.48980727316768, -112.33722530505833, 40.531763370983555, 40.64460162657784) LIMIT 2000000
The result is:
Notice that the blue dot now passes the filter and stays in view.
Current Limitations
- This approach is not an accurate representation of area on a map. It provides a reasonable approximate, but more error is introduced as you approach the poles, because this approach works only in two dimensions. As you approach the poles, you would realistically see areas that are oblong and egg-shaped. However, this approach works reasonably well for most inhabitable geo locations.
- The
symbol
mark types are procedurally generated and use a simple POINT primitive in the underlying graphics API. This primitive has a maximum pixel size for this primitive. The limit is graphics driver–implementation defined, but testing shows this limit to be 2000 pixels in diameter. This limit can have an effect if you zoom in tight on areas where the circles have large areas. You may see points disappear, similar to the filtering issue described earlier. This most likely occurs because the ultimate width/height generated by theconvert_meters_to_pixels
extension functions exceed this limit.As a workaround, use the
legacysymbol
mark type instead ofsymbol
. Thelegacysymbol
mark type does not render the shape procedurally, so it is not affected by this limit. Thelegacysymbol
mark was deprecated in favor of the improved rendering performance of the procedural approach. - When you use extension functions in SQL, you cannot use Vega scales to do further mapping; for example, you cannot use the contribution "amount" column to drive the size of the points in meters with a Vega scale. Any additional mapping must be done in the SQL, which may not be trivial depending on the complexity of the mapping.