Configuration Parameters

OmniSci has minimal configuration requirements, along with a number of optional settings. This topic describes the required and optional configuration changes you can make to your OmniSci instance.

Important: With release 4.5.0 and higher, OmniSci requires that all configuration flags used at startup match a flag on the OmniSci server. If any flag is misspelled or invalid, the server does not start. This change helps ensure that all settings are intentional and do not have an unexpected impact on performance or data integrity.

Data Directory

Before starting the OmniSci server, you must initialize the persistent data directory. To do so, create an empty directory at the desired path, such as /var/lib/omnisci, and store that path in the environment variable $OMNISCI_STORAGE:

export OMNISCI_STORAGE=/var/lib/omnisci

Change the owner of the directory to the user that the server will run as ($OMNISCI_USER):

sudo mkdir -p $OMNISCI_STORAGE
sudo chown -R $OMNISCI_USER $OMNISCI_STORAGE

Where $OMNISCI_USER is the system user account that the server runs as, such as omnisci, and $OMNISCI_STORAGE is the path to the parent of the OmniSci server data directory.

Finally, run $OMNISCI_PATH/bin/initdb with the data directory path as the argument:

$OMNISCI_PATH/bin/initdb $OMNISCI_STORAGE

Configuration File

You can store options in a configuration file. This is useful if, for example, you need to run the OmniSci server and web server on ports different from the defaults.

If you store a copy of omnisci.conf in the $OMNISCI_STORAGE directory, the configuration settings are picked up automatically by the sudo systemctl start omnisci_server and sudo systemctl start omnisci_web_server commands.

Set the flags in the configuration file using the format <flag> = <value>. Strings must be enclosed in quotes. The following is a sample configuration file. The entry for data is a string and must be in quotes. The last entry in the first section, null-div-by-zero, is the Boolean value true and does not require quotes.

port = 6274 
http-port = 6278
data = "/var/lib/omnisci/data"
null-div-by-zero = true

[web]
port = 6273
frontend = "/opt/omnisci/frontend"
servers-json = "/var/lib/omnisci/servers.json"
enable-https = true

To comment out a line in omnisci.conf, prepend the line with the pound sign (#) character.
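For example, a hypothetical variant of the [web] section of the sample file above, with the HTTPS setting disabled by a comment rather than deleted:

```ini
[web]
port = 6273
frontend = "/opt/omnisci/frontend"
servers-json = "/var/lib/omnisci/servers.json"
# enable-https = true
```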

Command-Line Parameters

You can make ad hoc changes to your configuration by specifying parameters on the command line at run time. Prepend two hyphens to the parameter, followed by any required argument. For example, the following command starts the OmniSci server using a temporary configuration file.

$OMNISCI_PATH/bin/omnisci_server --config ~/temp.conf

Configuration Parameters for OmniSci Server

These are the parameters for runtime settings on the OmniSci server and web server. The parameter syntax provides both the implied value and the default value as appropriate. Optional arguments are in square brackets, while implied and default values are in parentheses.

For example, consider allow-loop-joins [=arg(=1)] (=0).

  • If you do not use this flag, loop joins are not allowed by default.
  • If you provide no arguments, the implied value is 1 (true) (--allow-loop-joins).
  • If you provide the argument 0, that is the same as the default (--allow-loop-joins=0).
  • If you provide the argument 1, that is the same as the implied value (--allow-loop-joins=1).
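Putting the implied and default values together, the settings below are equivalent ways of enabling loop joins; the command-line forms are shown as comments, and the last line is the omnisci.conf form:

```ini
# --allow-loop-joins       (no argument: implied value 1, true)
# --allow-loop-joins=1     (explicit argument: same effect)
allow-loop-joins = true
```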

Configuration Flags for OmniSci Server
Each entry lists the flag syntax, a description, the implied and default values, and reasons to change the setting.
allow-cpu-retry [=arg] (implied: TRUE [1]; default: TRUE [1]). Allow queries that failed on GPU to retry on CPU, even when watchdog is enabled. Why change it: When watchdog is enabled, most queries that run on GPU and throw a watchdog exception fail. Turn this on to allow queries that fail the watchdog on GPU to retry on CPU. The default behavior is for queries that run out of memory on GPU to throw an error if watchdog is enabled. Watchdog is enabled by default.
allow-loop-joins [=arg(=1)] (=0) (implied: TRUE [1]; default: FALSE [0]). Enable loop joins. Why change it: Enables the loop join implementation instead of the default hash join implementation. Queries loop over all rows from all tables involved in the join and evaluate the join condition. Loop joins can be effective when you compare a large inner dataset to a small outer dataset. When both datasets are large, performance is predictably slower.
In most scenarios, hash join (the default) outperforms loop join. You might use loop joins when:
  • You cannot use a hash join. For example, the join condition does not exist, as in cross joins, or in some geospatial cases where it is not possible to easily create the hash table on the GPU.
  • Hash join performance is slow, usually because a highly skewed data distribution makes hash table probes expensive.
For best performance, avoid loop joins unless your requirements match one of these two scenarios.
bigint-count [=arg] (implied: FALSE [0]; default: FALSE [0]). Use 64-bit count. Why change it: Disabled by default because 64-bit integer atomics are slow on GPUs. Enable this setting if you see negative values for a count, indicating overflow. In addition, if your dataset has more than 4 billion records, you likely need to enable this setting.
calcite-max-mem arg (default: 1024). Maximum memory available to the Calcite JVM. Why change it: Change if Calcite reports out-of-memory errors.
calcite-port arg (default: 6279). Calcite port number. Why change it: Change to avoid collisions with ports already in use.
config arg (default: $OMNISCI_STORAGE). Path to omnisci.conf. Why change it: Change for testing and debugging.
cpu-only (default: FALSE). Run in CPU-only mode. Why change it: Set this flag to force OmniSci Core to run in CPU mode, even when GPUs are available. Useful for debugging and on shared-tenancy systems where the current OmniSci Core instance does not need to run on GPUs.
cpu-buffer-mem-bytes arg (default: 0). Size of memory reserved for CPU buffers, in bytes. Why change it: Change to restrict the amount of CPU/system memory OmniSci Core can consume. The default value of 0 indicates no limit on CPU memory use; OmniSci Server uses all available CPU memory on the system.
cuda-block-size arg (default: 0). Size of block to use on GPU. Why change it: GPU performance tuning: number of threads per block. The default of 0 means use all threads per block.
cuda-grid-size arg (default: 0). Size of grid to use on GPU. Why change it: GPU performance tuning: number of blocks per device. The default of 0 means use all available blocks per device.
data arg (default: $OMNISCI_STORAGE). Directory path to OmniSci catalogs. Why change it: Change for testing and debugging.
db-query-list arg. Path to a file containing OmniSci queries. Why change it: Use a query list to autoload data to GPU memory on startup to speed performance. See Preloading Data.
dynamic-watchdog-time-limit [=arg] (implied: 10000; default: 100000). Dynamic watchdog time limit, in milliseconds. Why change it: Change if dynamic watchdog is stopping queries expected to take longer than this limit.
enable-access-priv-check [=arg] (implied: TRUE [1]; default: TRUE [1]). Check user access privileges to database objects. Why change it: Set to FALSE to disable the privileges model; essentially identical to running with superusers only.
enable-debug-timer [=arg] (implied: TRUE [1]; default: FALSE [0]). Enable fine-grained query execution timers for debugging. Why change it: For debugging, logs verbose timing information for query execution (time to load data, time to compile code, and so on).
enable-dynamic-watchdog [=arg] (implied: TRUE [1]; default: FALSE [0]). Enable dynamic watchdog.
enable-filter-push-down [=arg(=1)] (=0) (implied: TRUE [1]; default: FALSE [0]). Enable filter push-down through joins. Why change it: Evaluates filters in the query expression for selectivity and pushes highly selective filters down into the join according to selectivity parameters. See also What is Predicate Pushdown?
enable-https-redirect [=arg] (implied: TRUE [1]; default: FALSE [0]). Enable a new port that omnisci_web_server listens on for incoming HTTP requests. When a request is received, the server returns a redirect response to the HTTPS port and protocol, so that browsers are immediately and transparently redirected. Why change it: Use to provide an OmniSci front end that can run on both the HTTP protocol (http://my-omnisci-frontend.com) on default HTTP port 80 and on the primary HTTPS protocol (https://my-omnisci-frontend.com) on default HTTPS port 443, with requests to the HTTP protocol automatically redirected to HTTPS. Without this, requests over HTTP fail. Assuming omnisci_web_server can attach to ports below 1024, the configuration would be:
enable-https-redirect = TRUE
http-to-https-redirect-port = 80
enable-overlaps-hashjoin [=arg(=1)] (=0) (implied: TRUE [1]; default: FALSE [0]). Enable the overlaps hash join framework, allowing range join (for example, spatial overlaps) computation using a hash table.
enable-watchdog [=arg] (implied: TRUE [1]; default: TRUE [1]). Enable watchdog.
filter-push-down-low-frac (default: 0.1). Higher threshold for selectivity of filters that are pushed down. Why change it: Filters with selectivity lower than this threshold are considered for push-down.
filter-push-down-passing-row-ubound (default: 4000000). Upper bound on the number of rows that should pass the filter if the selectivity is less than the high fraction threshold.
flush-log [arg] (implied: TRUE [1]; default: TRUE [1]). Immediately flush logs to disk. Why change it: Set to FALSE if this is a performance bottleneck.
from-table-reordering [=arg(=1)] (=1) (implied: TRUE [1]; default: TRUE [1]). Enable automatic table reordering in the FROM clause. Why change it: Reorders the sequence of a join to place large tables on the inside of the join clause and smaller tables on the outside. OmniSci also reorders tables between join clauses to prefer hash joins over loop joins. Change this value only in consultation with an OmniSci engineer.
gpu-buffer-mem-bytes [=arg] (default: 0). Size of memory reserved for GPU buffers, in bytes per GPU. Why change it: Change to restrict the amount of GPU memory OmniSci Core can consume per GPU. The default value of 0 indicates no limit on GPU memory use; OmniSci Core uses all available GPU memory across all active GPUs on the system.
gpu-input-mem-limit arg (default: 0.9). Force a query to CPU when input data memory usage exceeds this percentage of available GPU memory. Why change it: OmniSci Core loads data to GPU incrementally until data exceeds GPU memory, at which point the system retries on CPU. Loading data to GPU evicts any resident data already loaded or any query results that are cached. Use this limit to avoid attempting to load datasets to GPU when they obviously will not fit, preserving cached data on GPU and increasing query performance. If watchdog is enabled and allow-cpu-retry is not enabled, the query fails instead of re-running on CPU.
hll-precision-bits [=arg] (implied: 11; default: 11). Number of bits from the hash value used to specify the bucket number. Why change it: Change to increase or decrease approx_count_distinct() precision. Increased precision decreases performance.
http-port arg (default: 6278). HTTP port number. Why change it: Change to avoid collisions with ports already in use.
http-to-https-redirect-port arg (default: 6280). Configures the HTTP (incoming) port used by enable-https-redirect; this option specifies the redirect port number. Why change it: Used together with enable-https-redirect to redirect incoming HTTP requests to HTTPS; see that entry for the full configuration example.
idle-session-duration arg (default: 60). Maximum duration of an idle session, in minutes. Why change it: Change to increase or decrease the duration of an idle session before timeout.
inner-join-fragment-skipping [=arg(=1)] (=0) (implied: TRUE [1]; default: FALSE [0]). Enable or disable inner join fragment skipping. Why change it: Enables skipping fragments for improved performance during inner join operations.
license arg. Path to the file containing the license key. Why change it: Change if your license file is in a different location or has a different name.
max-session-duration arg (default: 43200, or 30 days). Maximum duration of the active session, in minutes. Why change it: Change to increase or decrease session duration before timeout.
null-div-by-zero [=arg] (default: FALSE [0]). Allows processing to complete when the dataset would cause a divide-by-zero error. Why change it: Set to TRUE to return null when dividing by zero; set to FALSE to throw an exception.
num-gpus arg (default: -1). Number of GPUs to use. Why change it: In a shared environment, you can assign the number of GPUs to a particular application. The default, -1, uses all available GPUs. Use in conjunction with start-gpu.
num-reader-threads arg (default: 0). Number of reader threads to use. Why change it: Drop the number of reader threads to prevent imports from using all available CPU power. The default is to use all threads.
overlaps-bucket-threshold arg (=0.1) (default: 0.1). The minimum size of a bucket corresponding to a given inner table range for the overlaps hash join.
-p | port int (default: 6274). Core server port. Why change it: Change to avoid collisions with other services if 6274 is already in use.
read-only [=arg(=1)] (implied: TRUE [1]; default: FALSE [0]). Enable read-only mode. Why change it: Prevents changes to the dataset.
render-mem-bytes arg (default: 500000000, or 500 MB). Size of memory reserved for rendering, in bytes. Why change it: The allocation is performed at startup on each configured GPU, is static, and persists while the server is running unless you run \clear_gpu_memory. Increase if you are rendering a large number of points or symbols and you get the following out-of-memory exception: Not enough OpenGL memory to render the query results.
render-poly-cache-bytes arg (default: 300000000, or 300 MB). Size of memory reserved for polygon rendering, in bytes. Why change it: Limits the maximum size of the polygon render cache. Use to improve polygon rendering performance from frame to frame when rendering the same query. Complex queries are often used with polygon rendering, such as choropleths that use expensive joins and aggregates, and the processing time required to build polygon buffers for rendering can be expensive.

In contrast to render-mem-bytes, no allocation is performed at startup. If no polygon rendering is performed, no allocations are executed that count toward this limit. Polygon buffer allocations are performed dynamically when requested. If the query results and polygon buffer sizes exceed the limit of the cache, the render can still be executed as long as sufficient GPU memory is available. However, you may see performance degradation from frame to frame; if so, consider increasing this cache size.

The INFO log can provide information about the optimal setting. For example, if you see a log message like the following, you can extract the size in bytes needed to render a specific query and adjust this setting accordingly:

Cannot cache <size of all polygon render buffers> bytes (<size of polygon coordinate buffer> for vbo/ibo) for poly query: <query str> on gpu <gpu id>. There is currently <current size of poly cache> of <max size of poly cache> total bytes used in the poly cache.
rendering [=arg] (implied: TRUE [1]; default: TRUE [1]). Enable or disable backend rendering. Why change it: Disable rendering when not in use, freeing the memory reserved by render-mem-bytes. To re-enable rendering, you must restart OmniSci Server.
res-gpu-mem =arg (default: 134217728, or 128 MB). Memory reserved on the GPU and not used by the OmniSci allocator. Why change it: Reserve extra memory for your system (for example, if the GPU is also driving your display, such as on a laptop or single-card desktop). OmniSci uses all memory on the GPU except for render-mem-bytes + res-gpu-mem. All of render-mem-bytes is allocated at startup. Also useful if other processes, such as a machine-learning pipeline, share the GPU with OmniSci. In advanced rendering scenarios or distributed setups, increase to free additional memory for the renderer, or for aggregating results for the renderer from multiple leaf nodes.
start-gpu arg (default: 0). First GPU to use. Why change it: Used in shared environments in which the first assigned GPU is not GPU 0. Use in conjunction with num-gpus.
trivial-loop-join-threshold [=arg] (implied: 1000; default: 1000). The maximum number of rows in the inner table of a loop join considered to be trivially small.
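As general background for hll-precision-bits (standard HyperLogLog behavior, not an OmniSci-specific formula): the relative error of a HyperLogLog estimate is roughly 1.04 / sqrt(2^bits), so the default of 11 bits corresponds to about 2.3 percent error. A small sketch:

```python
import math

def hll_relative_error(precision_bits: int) -> float:
    """Classic HyperLogLog standard-error estimate, 1.04 / sqrt(m),
    where m = 2**precision_bits is the number of registers."""
    return 1.04 / math.sqrt(2 ** precision_bits)

print(round(hll_relative_error(11), 4))  # default precision, roughly 0.023
print(round(hll_relative_error(14), 4))  # higher precision, more memory per sketch
```

This illustrates the trade-off the table describes: each additional precision bit doubles the register count and shrinks the error by a factor of sqrt(2), at the cost of performance and memory.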

Additional Enterprise Edition Parameters

cluster arg (default: $OMNISCI_STORAGE). Path to the data leaves list JSON file. Indicates that this OmniSci server instance is an aggregator node and where to find the rest of its cluster. Why change it: Change for testing and debugging.
compression-limit-bytes [=arg(=536870912)] (=536870912) (implied: 536870912; default: 536870912). Compress result sets that are transferred between leaves. Why change it: Minimum length of payload above which data is compressed.
compressor arg (=lz4hc) (implied: lz4hc; default: lz4hc). Compression algorithm used by the server to compress data being transferred between servers. Why change it: See Data Compression for compression algorithm options.
ha-brokers arg. Location of the HA brokers. Why change it: Kafka broker used for High Availability.
ha-group-id arg. ID of the HA group this server is in. Why change it: Match the group ID used for all servers in the OmniSci Core High Availability group.
ha-shared-path arg. Directory path to the shared OmniSci directory. Why change it: Required for a High Availability OmniSci Core setup. Specifies the shared file storage that allows multiple OmniSci Core servers to function as a High Availability cluster.
ha-unique-server-id arg. Unique ID to identify this server in the HA group. Why change it: Assign a unique ID to this server in the OmniSci High Availability group.
ldap-dn arg (default: uid=%s, cn=users, cn=accounts, dc=omnisci, dc=com). LDAP Distinguished Name.
ldap-role-query-regex arg. Regular expression used to extract the role from the role query result.
ldap-role-query-url arg. LDAP query role URL.
ldap-superuser-role arg. The role name that identifies a superuser.
ldap-uri arg. LDAP server URI.
leaf-conn-timeout [=arg] (implied: 20000; default: 20000). Leaf connect timeout, in milliseconds. Why change it: Increase or decrease to fail Thrift connections between OmniSci Core instances more or less quickly if a connection cannot be established.
leaf-recv-timeout [=arg] (implied: 300000; default: 300000). Leaf receive timeout, in milliseconds. Why change it: Increase or decrease to fail Thrift connections between OmniSci Core instances more or less quickly if data is not received in the time allotted.
leaf-send-timeout [=arg] (implied: 300000; default: 300000). Leaf send timeout, in milliseconds. Why change it: Increase or decrease to fail Thrift connections between OmniSci Core instances more or less quickly if data is not sent in the time allotted.
saml-metadata-file arg. Path to the identity provider metadata file. Why change it: Required for running SAML. An identity provider (such as Okta) supplies a metadata file, from which OmniSci uses:
  1. The public key of the identity provider, to verify that the SAML response comes from it and not from somewhere else.
  2. The URL of the SSO login page used to obtain a SAML token.
saml-sp-target-url arg. URL of the service provider for which SAML assertions should be generated. Why change it: Required for running SAML. Used to verify that a SAML token was issued for OmniSci and not for some other service.
saml-sync-roles arg (=0) (default: FALSE [0]). Enable mapping of SAML groups to OmniSci roles. Why change it: When enabled, the SAML identity provider (for example, Okta) automatically creates users at login and assigns them the roles they already have as groups in SAML.
string-servers arg. Path to the string servers list JSON file. Indicates that OmniSci Core is running in distributed mode; required to designate a leaf server when running in distributed mode.
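A hypothetical aggregator configuration combining several of the High Availability flags above; all paths, IDs, and the broker address are illustrative examples, not defaults:

```ini
cluster = "/var/lib/omnisci/leaves.json"
ha-group-id = "omnisci-ha"
ha-unique-server-id = "aggregator-1"
ha-brokers = "kafka-host:9092"
ha-shared-path = "/mnt/omnisci-shared"
```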

Configuration Parameters for OmniSci Web Server

Configuration Flags for OmniSci Web Server
Each entry lists the flag syntax, a description, the default value, and reasons to change the setting.
-b | backend-url string (default: http://localhost:6278). URL to http-port on omnisci_server. Why change it: Change to avoid collisions with other services.
cert string (default: cert.pem). Certificate file for HTTPS. Why change it: Change for testing and debugging.
-c | config string. Path to the OmniSci configuration file. Why change it: Change for testing and debugging.
-d | data string (default: data). Path to the OmniSci data directory. Why change it: Change for testing and debugging.
db-query-list <path-to-query-list-file>. Preload data to memory based on SQL queries stored in a list file. Why change it: Automatically run queries that load the most frequently used data to enhance performance. See Preloading Data.
docs string (default: docs). Path to the documentation directory. Why change it: Change if you move your documentation files to another directory.
enable-https. Enable HTTPS support. Why change it: Change to enable secure HTTP.
-f | frontend string (default: frontend). Path to the frontend directory. Why change it: Change if you move the location of your frontend UI files.
key string (default: key.pem). Key file for HTTPS. Why change it: Change for testing and debugging.
-p | port int (default: 6273). Frontend server port. Why change it: Change to avoid collisions with other services.
-r | read-only. Enable read-only mode. Why change it: Prevent changes to the data.
servers-json string. Path to servers.json. Why change it: Change for testing and debugging.
timeout duration (default: 1h0m0s). Maximum request duration, in #h#m#s format. For example, 0h30m0s represents a duration of 30 minutes. Why change it: Controls the maximum duration of individual HTTP requests; used to manage resource exhaustion caused by improperly closed connections. This also limits the execution time of queries made over the Thrift HTTP transport. Increase the duration if queries are expected to take longer than the default of one hour; for example, if you COPY FROM a large file when using omnisql with the HTTP transport.
tmpdir string (default: /tmp). Path for temporary file storage. Why change it: Used as a staging location for file uploads. Consider locating this directory on the same file system as the OmniSci data directory. If not specified on the command line, omnisci_web_server recognizes the standard TMPDIR environment variable as well as a specific OMNISCI_TMPDIR environment variable, the latter of which takes precedence. If you use neither the command-line argument nor one of the environment variables, the default, /tmp, is used.
-v | verbose. Enable verbose logging. Why change it: Adds log messages for debugging purposes.
version. Return the version.
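The #h#m#s format accepted by the timeout flag reads as hours, minutes, and seconds. A minimal illustrative parser for that format (not OmniSci code; the function name is hypothetical):

```python
import re

def parse_duration_seconds(value: str) -> int:
    """Parse a duration like '1h0m0s' or '0h30m0s' into total seconds.
    Illustrative only; mirrors the #h#m#s format described above."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", value)
    if not match or not any(match.groups()):
        raise ValueError(f"not a #h#m#s duration: {value!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

print(parse_duration_seconds("0h30m0s"))  # 1800
print(parse_duration_seconds("1h0m0s"))   # 3600
```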