Frontend and backend protocols

Vertica uses a message-based protocol for communication between frontends and backends (clients and servers). The protocol is supported over TCP/IP sockets. This document describes latest version of the protocol.

For purposes of the protocol, the terms “backend” and “server” are interchangeable; likewise “frontend” and “client” are interchangeable. See vertica-python for a frontend implementation reference of the protocol.

Overview

The protocol has separate phases for startup and normal operation.

In the startup phase, the frontend opens a connection to the backend and authenticates itself to the satisfaction of the backend. (This might involve a single message, or multiple messages depending on the authentication method being used.) If all goes well, the backend then sends status information to the frontend, and finally enters normal operation. Except for the initial startup-request message, this part of the protocol is driven by the backend.

During normal operation, the frontend sends queries and other commands to the backend, and the backend sends back query results and other responses. There are a few cases wherein the backend will send unsolicited messages, but for the most part this portion of a session is driven by frontend requests.

Termination of the session is normally by frontend choice, but can be forced by the backend in certain cases. In any case, when the backend closes the connection, it will roll back any open (incomplete) transaction before exiting.

Within normal operation, SQL commands can be executed through either of two sub-protocols. In the “simple query” protocol, the frontend just sends a textual query string, which is parsed and immediately executed by the backend. In the “extended query” protocol, processing of queries is separated into multiple steps: parsing, binding of parameter values, and execution. This offers flexibility and performance benefits, at the cost of extra complexity.

Normal operation has additional sub-protocols for special operations such as COPY.

All communication is through a stream of messages. The byte order of all messages are big-endian (A special case in WriteFile message). All data sending to or coming from the backend are represented as UTF-8.

Formats and Format Codes

Data of a particular SQL data type might be transmitted in either “text” format or “binary” format. The desired format for any value is specified by a format code. Text has format code zero, binary has format code one, and all other format codes are reserved for future definition.

The text representation of values is whatever strings are produced and accepted by the input/output conversion functions for the particular data type. For example, the value of DATE/TIMESTAMP data is a date/timestamp in proleptic Gregorian calendar.

Binary Format Encoding For Each Data Type

Data Type	Length (bytes)	Data Content
BOOLEAN	1	0 for false, 1 for true.
CHAR	User-specified	A sequence of UTF-8 encoding bytes. UTF-8 strings can contain multi-byte characters. Therefore, number of characters in the string may not equal the number of bytes. Strings shorter than the specified length are right-padded with spaces.
VARCHAR	Variable-length	A sequence of UTF-8 encoding bytes. Remember that UTF-8 strings can contain multi-byte characters. Therefore, number of characters in the string may not equal the number of bytes.
LONG VARCHAR	Variable-length
INTEGER	8	64-bit integer.
FLOAT	8	Encoded in IEEE-754 format.
NUMERIC	N (N = (precision//19+1)*8)	N-byte signed integer containing the unscaled value. Scale is calculated separately with RowDescription's type modifier.
DATE	8	64-bit integer containing the Julian day number (JDN) for the date.
INTERVAL	8	64-bit integer containing the number of microseconds in the interval.
INTERVALYM	8	64-bit integer containing the number of months in the interval.
TIME	8	64-bit integer containing the number of microseconds since midnight in the UTC time zone.
TIMETZ	8	64-bit value where: Upper 40 bits contain the number of microseconds since midnight in the UTC time zone. Lower 24 bits contain time zone as the UTC offset in seconds calculated as follows: Actual time zone = 86400(24hrs) - this 3-byte value as integer
TIMESTAMP	8	64-bit integer containing the number of microseconds since Julian day: Jan 01 2000 00:00:00.
TIMESTAMPTZ	8	64-bit integer containing the number of microseconds since Julian day: Jan 01 2000 00:00:00 in the UTC timezone.
UUID	16	A 128-bit value interpreted as UUID.
BINARY	User-specified	A sequence of bytes. If the value is smaller than the specified length, the remainder should be filled with nulls (\x00).
VARBINARY	Variable-length	A sequence of bytes.
LONG VARBINARY	Variable-length	A sequence of bytes.
Complex types (ARRAY, MAP, ROW, SET)	/	Data content is the same as TEXT format.

Note: While PostgreSQL lets you change the format on a per-column/type basis in the protocol, Vertica currently does not honor that and instead does it as a session-level setting. So unfortunately that means its an all-or-nothing implementation - all data types must be implemented. To opt into binary format protocol for data reads from backend (queries), the frontend have to set the StartupRequest message to include parameter ‘binary_data_protocol’ set to ‘1’ (‘0’ is the default, which is the text format protocol), and set the Bind message’s last two fields (result-column format code) as ‘1’. No Vertica clients currently support data writes (server-prepared statement bind values) to the backend in binary format.

Message Flow

This section describes the message flow and the semantics of each message type. (Details of the exact representation of each message appear in Section Message Formats.)

Start-up

To begin a session, a frontend opens a connection over TCP/IP to the server. With a TCP socket, the frontend does the following steps:

Connection load balancing

A backend with load balancing enabled can redirect connections based on the connection’s origin. To initiate a connection load balancing, the frontend sends a LoadBalanceRequest message. The server then responds with a LoadBalanceResponse message, the first byte of the message containing Y or N, indicating that it is willing or unwilling to perform load balancing, respectively. To continue after Y, the frontend must create a new TCP socket, which connects to the new host/port specified in the LoadBalanceResponse ‘Y’ message. To continue N, send the usual SSLRequest message and Startup message and proceed without connection load balancing.

To enable connection load balancing on the server side, please see Vertica documentation.

Workload routing is different from connection load balancing, which happens after authentication. The node the client connects to would forward traffic to the execution node. In other words, the connection node acts as a proxy between the client and the execution node.

SSL session encryption

Frontend/backend communications can be encrypted using TLS/SSL. This provides communication security in environments where attackers might be able to capture the session traffic.

To initiate an SSL-encrypted connection, the frontend sends an SSLRequest message. The server then responds with a single byte containing S or N, indicating that it is willing or unwilling to perform SSL, respectively. The frontend might close the connection at this point if it is dissatisfied with the response. To continue after S, perform an SSL startup handshake (not described here, part of the SSL specification) with the server. If this is successful, all subsequent data will be SSL-encrypted. To continue after N, send the usual Startup message and proceed without encryption.

To enable TLS/SSL communications on the server side, please see Vertica documentation.

Startup message and authentication

Once the socket setup is done, the frontend sends a StartupRequest message to the backend. This message includes the names of the user and of the database the user wants to connect to; it also identifies the particular protocol version to be used. Optionally, the StartupRequest message can include additional settings for run-time parameters. The server then uses this information to determine whether the connection is provisionally acceptable, and what additional authentication is required (if any).

The server then sends an appropriate authentication request message, to which the frontend must reply with an appropriate authentication response message (such as a Password).

The authentication cycle ends with the server either rejecting the connection attempt (ErrorResponse), or sending AuthenticationOk.

The possible messages from the server in this phase are:

ErrorResponse: The connection attempt has been rejected. The server then immediately closes the connection.
AuthenticationOk: The authentication exchange is successfully completed.
AuthenticationCleartextPassword: The frontend must now send a Password message containing the password in clear-text form. If this is the correct password, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.
AuthenticationGSS: The frontend must now initiate a GSSAPI negotiation. The frontend will send a Password message with the first part of the GSSAPI data stream in response to this. If further messages are needed, the server will respond with AuthenticationGSSContinue. See GSS-API/Kerberos Authentication.
AuthenticationGSSContinue: This message contains the response data from the previous step of GSSAPI negotiation (AuthenticationGSS, or a previous AuthenticationGSSContinue). If the GSSAPI data in this message indicates more data is needed to complete the authentication, the frontend must send that data as another Password message. If GSSAPI authentication is completed by this message, the server will next send AuthenticationOk to indicate successful authentication or ErrorResponse to indicate failure.
AuthenticationPasswordExpired: The connection attempt has been rejected as the password is expired.
AuthenticationPasswordGrace: This message informs the frontend that the password will expire soon. The frontend can display a warning message. Then, the frontend must wait for further messages from the server.
AuthenticationMD5Password or AuthenticationHashMD5Password: The frontend must now send a Password message containing the password (with username) encrypted via MD5, then encrypted again using the 4-byte random salt specified in this message. If this is the correct password, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.
AuthenticationHashPassword or AuthenticationHashSHA512Password: The frontend must now send a Password message containing the password (with the 16-byte user salt specified in this message) encrypted via SHA512, then encrypted again using the 4-byte random salt specified in this message. If this is the correct password, the server responds with an AuthenticationOk, otherwise it responds with an ErrorResponse.

If the frontend does not support the authentication method requested by the server, then it should immediately close the connection.

After having received AuthenticationOk, the frontend must wait for further messages from the server. In this phase a backend process is being started, and the frontend is just an interested bystander. It is still possible for the start-up attempt to fail (ErrorResponse), but in the normal case the backend will send some ParameterStatus messages, BackendKeyData, and finally ReadyForQuery.

During this phase the backend will attempt to apply any additional run-time parameter settings that were given in the StartupRequest message. If successful, these values become session defaults. An error causes ErrorResponse and exit.

The possible messages from the backend in this phase are:

ParameterStatus: This message informs the frontend about the current (initial) setting of backend parameters, such as client_locale or auto_commit. The frontend can ignore this message, or record the settings for its future use. The frontend should not respond to this message, but should continue listening for a ReadyForQuery message.
BackendKeyData: This message provides secret-key data that the frontend must save if it wants to be able to issue cancel requests later. The frontend should not respond to this message, but should continue listening for a ReadyForQuery message.
ReadyForQuery: Start-up is completed. The frontend can now issue commands.
ErrorResponse: Start-up failed. The connection is closed after sending this message.
NoticeResponse: A warning message has been issued. The frontend should display the message but continue listening for ReadyForQuery or ErrorResponse.

The ReadyForQuery message is the same one that the backend will issue after each command cycle. Depending on the coding needs of the frontend, it is reasonable to consider ReadyForQuery as starting a command cycle, or to consider ReadyForQuery as ending the start-up phase and each subsequent command cycle.

Authentication

GSS-API / Kerberos Authentication

GSS-API is Generic Security Service API (RFC 2744). It provides a common interface for accessing different security services. One of the most popular security services available for GSS-API is the Kerberos v5.

Here are the most basic steps taken to authenticate in a Kerberized environment:

Client requests an authentication ticket (TGT) from the Key Distribution Center (KDC).
The KDC verifies the credentials and sends back an encrypted TGT and session key.
The TGT is encrypted using the Ticket Granting Service (TGS) secret key.
The client stores the TGT and when it expires the local session manager will request another TGT (this process is transparent to the user).

kinit in Linux is a command often used for obtaining or caching/renewing a Kerberos ticket-granting ticket (TGT).

When the client requests access to a server, this is the process:

The client sends a StartupRequest message to the server.
The server sends back a AuthenticationGSS message to request GSS-API negotiation.
The client sends the current TGT to the TGS with the Service Principal Name (SPN) of the Vertica server.
The KDC verifies the TGT of the user and that the user has access to the Vertica server service.
TGS sends a valid session key for the service to the client.
Client forwards the session key (within a Password message) to the server to prove the user has access.
If further messages are needed, the server will respond with a AuthenticationGSSContinue message. The client sends data as another Password message.
The server grants access and sends a AuthenticationOk; or sends a ErrorResponse to indicate failure.

Example message flow of GSS authentication

OAuth2 authentication

OAuth lets client applications verify themselves to and receive an OAuth access token from an identity provider (IDP) like Keycloak or Okta. These client applications can then pass the access token to authenticate to Vertica server as a substitute for username & password.

A client user should first configure the IDP and get the following parameters:

client_id: ID of the confidential client (application) registered in IDP. This is used by server to call introspection API to retrieve user grants; and used by client to retrieve tokens.
client_secret: The secret of the confidential client (application) registered in IDP. This is used by server to call introspection API to retrieve user grants; and used by client to retrieve tokens.
introspect_url: API used by server to introspect access token. This URL is exposed by IDP and must be accessible from server.
token_url: API used by client to refresh the token. This URL is exposed by IDP and must be accessible from client.

Then in the server, create an OAuth authentication record with above parameters, and grant it to the user.

The client application or user call the IDP token API to retrieve the following parameters:

access_token: Access tokens are the thing that applications use to make connection requests on behalf of a user.
refresh_token: Refresh tokens are used to obtain a new access token.

Example message flow of OAuth authentication (Protocol 3.11)

Example message flow of OAuth access token refresh (Protocol 3.11)

(Protocol 3.11)

In the StartupRequest, only access_token is required to be specified in oauth_access_token parameter, no client id/secret is included in the request. The ‘user’ parameter can be an empty string. The backend validates token using OAuth Introspect query. In case oauth_access_token is valid and permissions are sufficient, the backend sends AuthenticationOk, otherwise it responds with an ErrorResponse. Token refresh flow can be triggered by frontend after the token validation (if token introspection fails).

Note: It is not recommended for clients to implement this protocol version by default. Protocol 3.12 is the right way to pass oauth_access_token to servers. For backwards compatibility with v11.1SP1 - v12.0SP1 servers, clients can implement a parameter to include the oauth_access_token in the startup request.

(Protocol 3.12)

In the StartupRequest, if the ‘auth_category’ parameter is specified as “OAuth”, the server will send the client an AuthenticationOAuth message. The client will respond with a Password message containing an OAuth access token. The backend validates token using OAuth Introspect query. In case the access token is valid and permissions are sufficient, the backend sends AuthenticationOk, otherwise it responds with an ErrorResponse. Token refresh flow can be triggered by frontend after the token validation (if token introspection fails).

If ‘auth_category’ parameter is not set, the server still check for the token in the ‘oauth_access_token’ parameter of StartupRequest message.

Example message flow of OAuth authentication (Protocol 3.12)

(Protocol 3.15)

In OAuth 2.0, the term “grant type” refers to the way an application gets an access token. OAuth 2.0 defines several grant types, including the authorization code flow, which requires the app launch a browser to begin the flow.

New fields (Auth URL, Token URL, client_id) added in the AuthenticationOAuth message to support OAuth browser workflow (i.e. Authorization Code grant type).

At a high level, the authorization code flow has the following steps:

The client application opens a browser to send the user to the OAuth server (Auth URL)
The user sees the authorization prompt and approves the app’s request
The user is redirected back to the client application with an authorization code in the query string
The client application exchanges the authorization code (with Token URL) for an access token

client_id is required every time a HTTP request is issued to the IDP. client_secret is required only for requests to the Token URL, but it is not allowed to be sent by the server for any reason. So the client application needs to provide the client_secret to the IDP to retrieve an access token.

Example message flow of OAuth browser workflow (Protocol 3.15)

(Protocol 3.16)

New fields (scope, validate_hostname) added in the AuthenticationOAuth message.

Simple Query

A simple query cycle is initiated by the frontend sending a Query message to the backend. The message includes an SQL command (or commands) expressed as a text string. The backend then sends one or more response messages depending on the contents of the query command string, and finally a ReadyForQuery response message. ReadyForQuery informs the frontend that it can safely send a new command.

Example message flow of simple query with data rows returned

The possible response messages from the backend are:

CommandComplete
An SQL command completed normally.
RowDescription
Indicates that rows are about to be returned in response to a query. The contents of this message describe the column layout of the rows. This will be followed by a DataRow message for each row being returned to the frontend.
VerifyFiles
The backend starts processing a COPY LOCAL command; see Section COPY Operations – Copy-Local mode.
CopyInResponse
The backend starts processing a COPY STDIN command; see Section COPY Operations – Copy-Stdin mode
DataRow
One of the set of rows returned by a query.
EmptyQueryResponse
An empty query string was recognized.
ParameterStatus
This message informs the frontend about the current setting of backend parameters, a SET query normally produce this message.
NoticeResponse
A warning message has been issued in relation to the query. For example, issue a COMMIT when no transaction in progress. NoticeResponses are in addition to other responses, i.e., the backend will continue processing the command.
ErrorResponse
An error has occurred.
ReadyForQuery
Processing of the query string is complete. A separate message is sent to indicate this because the query string might contain multiple SQL commands. (CommandComplete marks the end of processing one SQL command, not the whole string.) ReadyForQuery will always be sent, whether processing terminates successfully or with an error.

The response to a SELECT query (or other queries that return row sets, such as EXPLAIN or SHOW) normally consists of RowDescription, zero or more DataRow messages, and then CommandComplete. COPY command invokes special protocol as described in Section COPY Operations. Other query types normally produce only a CommandComplete message.

Since a query string could contain several queries (separated by semicolons), there might be several such response sequences before the backend finishes processing the query string. ReadyForQuery is issued when the entire string has been processed and the backend is ready to accept a new query string.

If a completely empty (no contents other than whitespace) query string is received, the response is EmptyQueryResponse followed by ReadyForQuery.

In the event of an error, ErrorResponse is issued followed by ReadyForQuery. All further processing of the query string is aborted by ErrorResponse (even if more queries remained in it). Note that this might occur partway through the sequence of messages generated by an individual query.

Recommended practice is to code frontends in a state-machine style that will accept any message type at any time that it could make sense, rather than wiring in assumptions about the exact sequence of messages.

Extended Query

The extended query protocol breaks down the above-described simple query protocol into multiple steps. The overall execution cycle consists of a parse step, which creates a prepared statement from a textual query string; a bind step, which creates a portal given a prepared statement and values for any needed parameters; and an execute step that runs a portal’s query. One of the advantages of extended query protocol is preventing SQL injection attacks.

In the extended protocol, the frontend first sends a Parse message, which contains a textual query string. The query string leaves certain values (must be SQL literals; otherwise a syntax error is reported) unspecified with parameter placeholders (i.e. question mark ‘?’). The response is either ParseComplete or ErrorResponse.

Note: The query string contained in a Parse message cannot include more than one SQL statement; else a syntax error is reported. This restriction does not exist in the simple-query protocol.
The query string contained in a Parse message cannot be a COPY FROM LOCAL SQL statement; else a protocol error is reported. See Section Copy-Local mode.

If successfully created, a named prepared-statement object lasts till the end of the current session, unless explicitly destroyed. Within a session, the backend can keep track of multiple prepared statements. Existing prepared statements and portals are referenced by names assigned when they were created. The frontend can optimize its behavior when the same query string is used, but different parameters are bound to it (many times), i.e. parse once and then bind & execute many times.

Once a prepared statement exists, it can be readied for execution using a Bind message. The Bind message gives the name of the source prepared statement, the name of the destination portal (empty string denotes the unnamed portal), and the values to use for any parameter placeholders (’?’) present in the prepared statement. The supplied parameter set must match those needed by the prepared statement. Bind also specifies the format to use for any data returned by the query; the format can be specified overall, or per-column. The response is either BindComplete or ErrorResponse.

Note: In Vertica server, there is only one portal. The portal name required in some messages is actually ignored by the backend.

Once a portal exists, it can be executed using an Execute message. The Execute message specifies the portal name (empty string denotes the unnamed portal) and a maximum result-row count (Vertica backend ignores this and always fetch all rows). The possible responses to Execute are the same as those described above for queries issued via simple query protocol, except that Execute doesn’t cause ReadyForQuery or RowDescription to be issued, also PortalSuspended replaced CommandComplete to indicate completion of the source SQL command.

At completion of each series of extended-query messages, the frontend should issue a Sync message. This parameterless message causes the backend to close the current transaction if it’s not inside a BEGIN/COMMIT transaction block (“close” meaning to commit if no error, or roll back if error). Then a ReadyForQuery response is issued. The purpose of Sync is to provide a resynchronization point for error recovery. When an error is detected while processing any extended-query message, the backend issues ErrorResponse, then reads and discards messages until a Sync is reached, then issues ReadyForQuery and returns to normal message processing. (But note that no skipping occurs if an error is detected while processing Sync — this ensures that there is one and only one ReadyForQuery sent for each Sync.)

In most scenarios the frontend should issue a Describe before issuing Execute, to ensure that it knows how to interpret the results it will get back. The Describe message (statement variant) specifies the name of an existing prepared statement. The response is a ParameterDescription message describing the parameters needed by the statement, followed by a RowDescription message describing the rows that will be returned when the statement is eventually executed (or a NoData message if the statement will not return rows), and followed by a CommandDescription message describing the type of command to be executed and any semantically-equivalent COPY statement. ErrorResponse is issued if there is no such prepared statement. Note that since Bind has not yet been issued, the formats to be used for returned columns are not yet known to the backend; the format code fields in the RowDescription message will be zeroes in this case. The Describe message (portal variant) specifies the name of an existing portal (or an empty string for the unnamed portal). The response is a RowDescription message describing the rows that will be returned by executing the portal; or a NoData message if the portal does not contain a query that will return rows; or ErrorResponse if there is no such portal.

The Close message closes an existing prepared statement or portal and releases resources. It is not an error to issue Close against a nonexistent statement or portal name. The response is normally CloseComplete, but could be ErrorResponse if some difficulty is encountered while releasing resources. Note that closing a prepared statement implicitly closes any open portals that were constructed from that statement.

The Flush message does not cause any specific output to be generated, but forces the backend to deliver any data pending in its output buffers. A Flush must be sent after any extended-query command except Sync, if the frontend wishes to examine the results of that command before issuing more commands. Without Flush, messages returned by the backend will be combined into the minimum possible number of packets to minimize network overhead.

COPY Operations

The COPY command allows high-speed bulk data transfer from the client to the server. COPY operations can be divided into copy-local mode (it is further divided into copy-local-stdin mode and copy-local-file mode) and copy-stdin mode.

Copy-Local mode

Copy-local mode is initiated when the backend executes a COPY FROM LOCAL FILES or COPY FROM LOCAL STDIN SQL statement in Query message. The backend sends a RowDescription message (For the DataRow message later indicating the number of rows loaded) and a VerifyFiles message to the frontend. The backend parses the file names out of the command, and sends them back to the frontend in VerifyFiles message. If COPY command uses the REJECTED DATA and/or EXCEPTIONS parameters, VerifyFiles message contains filenames for them. In copy-local-file mode, VerifyFiles message also contains input filenames. The frontend has to verify that these input files exist and are readable, and rejected data and exceptions files are writable.

Then the frontend should send a VerifiedFile message specifying a list of input files that the backend can ask to load. In copy-local-stdin mode, this is an empty list. In copy-local-file mode, this must be a non-empty list, because sending an empty list of files will make server kill the session. If the backend does not send a ErrorResponse, it is ready to copy data from STDIN/files. The backend might send ParameterStatus messages, then

In copy-local-file mode, for each file to load, the backend sends a LoadFile message to ask for data from a file. The frontend should then send zero or more CopyData messages, forming a stream of input data. The message boundaries are not required to have anything to do with row boundaries, although that is often a reasonable choice. The frontend should send a EndOfBatchRequest message to indicate that a batch of rows has been sent, and the frontend is expecting an acknowledgment and possibly rejected row descriptions from the backend. The backend should then send zero or more WriteFile messages for rejected row descriptions and a EndOfBatchResponse message for acknowledgement of data loading from this file. When the backend finishes asking for data from all files, it sends a CopyDoneResponse message.

Example message flow of COPY FROM LOCAL FILE

In copy-local-stdin mode, the backend sends a CopyInResponse message to the frontend to ask for data. Same as in copy-local-file mode, the frontend should send zero or more CopyData messages, forming a stream of input data, and a EndOfBatchRequest message and then receive zero or more WriteFile messages and an EndOfBatchRequest message. In this mode, there will be no incoming message until the frontend send a CopyDone message to end this STDIN loading.

Example message flow of COPY FROM LOCAL STDIN

Security concerns: The frontend must verify the file the backend asks for reading (LoadFile) or writing (WriteFile) is not tampered.

After loading all the data, the backend sends a DataRow message indicating the number of rows loaded, and then a CommandComplete message indicating the end of the COPY command. The query string might contain multiple SQL commands, the backend will return to the command-processing mode of Simple Query protocol when a COPY command ended. ReadyForQuery will always be sent when the query string is complete.

In the event of a frontend-detected error during copy-local mode, the frontend can terminate it by sending a CopyError message, which will cause the COPY SQL statement to fail with an ErrorResponse. In the event of a backend-detected error (including receipt of a CopyError message), the backend will issue an ErrorResponse message, any subsequent messages issued by the frontend will simply be dropped, and ReadyForQuery is issued.

Copy-Stdin mode

This protocol comes from PostgreSQL and is (partially) supported in Vertica server but not implemented by most Vertica clients. The backend cannot report the number of rows loaded after copy, and possibly rejected row descriptions.

Copy-stdin mode is initiated when the backend executes a COPY FROM STDIN SQL statement (not “COPY FROM LOCAL STDIN”) in a Query message. The backend might send ParameterStatus messages, and then a CopyInResponse message to the frontend to ask for data. The frontend should send zero or more CopyData messages, forming a stream of input data, and a CopyDone message to end this STDIN loading. The backend should process the command and send a CommandComplete message indicating the end of the COPY command. The query string might contain multiple SQL commands, the backend will return to the command-processing mode of Simple Query protocol when a COPY command ended. ReadyForQuery will always be sent when the query string is complete.

Example message flow of COPY FROM STDIN

In the event of a frontend-detected error during copy-stdin mode, the frontend can terminate it by sending a CopyFail or a CopyError message, which will cause the COPY SQL statement to fail with an ErrorResponse. In the event of a backend-detected error (including receipt of a CopyFail or a CopyError message), the backend will issue an ErrorResponse message, any subsequent CopyData, CopyDone, CopyFail or CopyError messages issued by the frontend will simply be dropped, and ReadyForQuery is issued.

Copy-stdin mode can also be initiated via Extended Query protocol, the COPY command is issued in a Parse message. The workflow is the same as normal processing of Extended Query protocol except that the backend should send a CopyInResponse message after receiving an Execute message. Then the frontend should send zero or more CopyData messages, forming a stream of input data, and a CopyDone or CopyFail/CopyError message to end the STDIN loading.

Asynchronous Operations

There are several cases in which the backend will send messages that are not specifically prompted by the frontend’s command stream. Frontends must be prepared to deal with these messages at any time, even when not engaged in a query.

NoticeResponse: It is possible for NoticeResponse messages to be generated due to outside activity; for example, if the database administrator commands a “fast” database shutdown, the backend will send a NoticeResponse indicating this fact before closing the connection. Accordingly, frontends should always be prepared to accept and display NoticeResponse messages, even when the connection is nominally idle.
ParameterStatus: ParameterStatus messages will be generated whenever the active value changes for any of the parameters the backend believes the frontend should know about. For example, when you do SET SESSION AUTOCOMMIT ON | OFF, you get back a ParameterStatus telling you the new value of autocommit.; At present Vertica supports a handful of parameters for which ParameterStatus will be generated, they are: standard_conforming_strings, server_version, client_locale, client_label, long_string_types, protocol_version, auto_commit, mars, timezone, database_name, request_complex_types, extend_copy_reject_info etc.; More parameters might be added in the future. A frontend should simply ignore ParameterStatus for parameters that it does not understand or care about.

Canceling Requests in Progress

During the processing of a query, the frontend might request cancellation of the query. The cancel request is not sent directly on the open connection to the backend for reasons of implementation efficiency: we don’t want to have the backend constantly checking for new input from the frontend during query processing. Cancel requests should be relatively infrequent, so we make them slightly cumbersome in order to avoid a penalty in the normal case.

To issue a cancel request, the frontend opens a new connection to the server and sends a CancelRequest message, rather than the StartupRequest message that would ordinarily be sent across a new connection. The server will process this request and then close the connection. For security reasons, no direct reply is made to the CancelRequest message.

A CancelRequest message will be ignored unless it contains the same key data (PID and secret key) passed to the frontend during connection start-up. If the request matches the PID and secret key for a currently executing backend, the processing of the current query is aborted.

The cancellation signal might or might not have any effect — for example, if it arrives after the backend has finished processing the query, then it will have no effect. If the cancellation is effective, it results in the current command being terminated early with an error message saying “Execution canceled by operator”.

The upshot of all this is that for reasons of both security and efficiency, the frontend has no direct way to tell whether a cancel request has succeeded. It must continue to wait for the backend to respond to the query. Issuing a cancel simply improves the odds that the current query will finish soon, and improves the odds that it will fail with an error message instead of succeeding.

Since the cancel request is sent across a new connection to the server and not across the regular frontend/backend communication link, it is possible for the cancel request to be issued by any process, not just the frontend whose query is to be canceled. This might provide additional flexibility when building multiple-process applications. It also introduces a security risk, in that unauthorized persons might try to cancel queries. The security risk is addressed by requiring a dynamically generated secret key to be supplied in CancelRequest.

Termination

The normal, graceful termination procedure is that the frontend sends a Terminate message and immediately closes the connection. On receipt of this message, the backend closes the connection and terminates.

In rare cases (such as an administrator-commanded database shutdown) the backend might disconnect without any frontend request to do so. In such cases the backend will attempt to send an ErrorResponse or NoticeResponse message giving the reason for the disconnection before it closes the connection.

Session Transfer

When admin initiates session transfer request to the server:

SELECT pause_client_connections('subcluster_name', '{"config_key":"config_val"}');
SELECT redirect_client_connections('subcluster_name', 'new_hostname_or_ip:port', '{"config_key":"config_val"}');

Server will pause all currently active client connections once the current transaction finishes and results are done being read, then the server sends a SessionRedirect message in order to redirect the client to a new node to resume the session. This feature is enabled when ‘session_transfer_support’ in the StartupRequest message is set to true.

Once the client parses SessionRedirect message and get

new host and port to reconnect to
session information: a binary payload that is opaque to the client

It can connect to the new node, go through load balancing and TLS negotiation, and send a new StartupRequest message. In StartupRequest message, ‘auth_category’ parameter should be set to “SessionResume”. The server will send a AuthenticationSessionTransfer message and expect the client to send session information within a Password message.

<TODO: message flow graph>

Lastly, if admin wants to start allowing use of the paused subcluster again, resume it with:

SELECT resume_client_connections('subcluster_name', '{"config_key":"config_val"}');

Message Data Types

This section describes the base data types used in messages.

Intn(i): An n-bit integer in network byte order (most significant byte first). If i is specified it is the exact value that will appear, otherwise the value is variable. Eg. Int16, Int32(42).
Intn[k]: An array of k n-bit integers, each in network byte order. The array length k is always determined by an earlier field in the message. Eg. Int16[M].
String(s): A null-terminated string (C-style string). There is no specific length limitation on strings. If s is specified it is the exact value that will appear, otherwise the value is variable. Eg. String, String(“user”).

Note: There is no predefined limit on the length of a string that can be returned by the backend. Good coding strategy for a frontend is to use an expandable buffer so that anything that fits in memory can be accepted. If that’s not feasible, read the full string and discard trailing characters that don’t fit into your fixed-size buffer.

String[k]: An array of k null-terminated strings (C-style strings). The array length k is always determined by an earlier field in the message. Eg. String[N].
Byten(c): Exactly n bytes. If the field width n is not a constant, it is always determinable from an earlier field in the message. If c is specified it is the exact value. Eg. Byte2, Byte1(’\n’).

Message Formats

This section describes the detailed format of each message. Each message is classified to indicate that it may be sent by a frontend (client), or a backend (server). In most cases, the first byte of a message identifies the message type, and the next four bytes specify the length of the rest of the message (this length count includes itself, but not the message-type byte). The remaining contents of the message are determined by the message type. There are a few message types that have no initial message-type byte.

Frontend

Bind ‘B’

Type	Description
Byte1(‘B’)	Identifies the message as a Bind command.
Int32	Length of message contents in bytes, including self.
String	The name of the destination portal (an empty string selects the unnamed portal).
String	The name of the source prepared statement (an empty string selects the unnamed prepared statement).
Int16	The number of parameter format codes that follow (denoted C below). This can be zero to indicate that there are no parameters or that the parameters all use the default format (text); or one, in which case the specified format code is applied to all parameters; or it can equal the actual number of parameters.
Int16[C]	The parameter format codes. Each must presently be zero (text) or one (binary).
Int16	The number of parameter values that follow (possibly zero) (denoted P below). This must match the number of parameters needed by the query.
Int32[P]	The parameter type Oids.
Next, the following pair of fields appear for each parameter:
Int32	The length of the parameter value, in bytes (this count does not include itself) (denoted n below). Can be zero. As a special case, -1 indicates a NULL parameter value. No value bytes follow in the NULL case.
Byten	The value of the parameter, in the format indicated by the associated format code.
After the last parameter, the following fields appear:
Int16	The number of result-column format codes that follow (denoted R below). This can be zero to indicate that there are no result columns or that the result columns should all use the default format (text); or one, in which case the specified format code is applied to all result columns (if any); or it can equal the actual number of result columns of the query.
Int16[R]	The result-column format codes. Each must presently be zero (text) or one (binary).

CancelRequest

Type	Description
Int32(16)	Length of message contents in bytes, including self.
Int32(80877102)	The cancel request code. The value is chosen to contain 1234 in the most significant 16 bits, and 5678 in the least 16 significant bits. (To avoid confusion, this code must not be the same as any protocol version number.)
Int32	The process ID of the target backend.
Int32	The secret key for the target backend.

ChangePassword ’n’

Type	Description
Byte1(’n’)	Identifies the message as a ChangePassword command.
Int32	Length of message contents in bytes, including self.
String	The new password.

Close ‘C’

Type	Description
Byte1(‘C’)	Identifies the message as a Close command.
Int32	Length of message contents in bytes, including self.
Byte1	‘S’ to close a prepared statement; or ‘P’ to close a portal.
String	The name of the prepared statement or portal to close (an empty string selects the unnamed prepared statement or portal).

CopyData ’d’

Type	Description
Byte1(’d’)	Identifies the message as COPY data.
Int32	Length of message contents in bytes, including self.
Byten	Data that forms part of a COPY data stream. Messages sent by frontends may divide the data stream arbitrarily.

CopyDone ‘c’

Type	Description
Byte1(‘c’)	Identifies the message as a COPY-complete indicator. Note: This is different from a CopyDoneResponse backend message.
Int32(4)	Length of message contents in bytes, including self.

CopyError ’e’

Type	Description
Byte1(’e’)	Identifies the message as a copy error. Terminate the Copy-Local protocol.
Int32	Length of message contents in bytes, including self.
String	Error file name.
Int32	Error line number.
String	Error method name.
String	Error message.

CopyFail ‘f’

Type	Description
Byte1(‘f’)	Identifies the message as a COPY-failure indicator.
Int32	Length of message contents in bytes, including self.
String	An error message to report as the cause of failure.

Describe ‘D’

Type	Description
Byte1(‘D’)	Identifies the message as a Describe command.
Int32	Length of message contents in bytes, including self.
Byte1	‘S’ to describe a prepared statement; or ‘P’ to describe a portal.
String	The name of the prepared statement or portal to describe (an empty string selects the unnamed prepared statement or portal).

EndOfBatchRequest ‘j’

Type	Description
Byte1(‘j’)	Identifies the message as a EndOfBatchRequest command. Signals that a batch of rows has been sent, and the client is expecting an acknowledgment and possibly rejected row descriptions from the server.
Int32(4)	Length of message contents in bytes, including self.

Execute ‘E’

Type	Description
Byte1(‘E’)	Identifies the message as an Execute command.
Int32	Length of message contents in bytes, including self.
String	The name of the portal to execute (an empty string selects the unnamed portal).
Int32	Maximum number of rows to return, if portal contains a query that returns rows (ignored otherwise). Zero denotes “no limit”. Note: Currently, Vertica backend will ignore this result-row count and send all the rows regardless of what you put here.

Flush ‘H’

Type	Description
Byte1(‘H’)	Identifies the message as a Flush command.
Int32(4)	Length of message contents in bytes, including self.

LoadBalanceRequest

Type	Description
Int32(8)	Length of message contents in bytes, including self.
Int32(80936960)	The LoadBalance request code. The value is chosen to contain 1235 in the most significant 16 bits, and 0000 in the least 16 significant bits. (To avoid confusion, this code must not be the same as any protocol version number.)

MarsRequest ‘_’

Type	Description
Byte1(’_’)	Identifies the message as a MARS request.
Int32(20)	Length of message contents in bytes, including self.
Int32	ResultSet ID
Int32	RequestType: Fetch(0x1) Close(0x2) NoRowDesc(0x4)
Int64	FetchCount

Parse ‘P’

Type	Description
Byte1(‘P’)	Identifies the message as a Parse command.
Int32	Length of message contents in bytes, including self.
String	The name of the destination prepared statement (an empty string selects the unnamed portal, i.e. prepared statement must be named).
String	The query string to be parsed.
Int16	The number of parameter data types specified (may be zero). Note that this is not an indication of the number of parameters that might appear in the query string, only the number that the frontend wants to pre-specify types for.
Then, for each parameter, there is the following:
Int32	Specifies the object ID of the parameter data type. Placing a zero here is equivalent to leaving the type unspecified. The server will ignore these values.

Password ‘p’

Type	Description
Byte1(‘p’)	Identifies the message as a password.
Int32	Length of message contents in bytes, including self.
String	The password (encrypted, if requested). All password strings processed by Vertica require a null terminator, as is standard with any string in the Front End / Back End protocol. GSS is special because Vertica doesn’t touch the buffer it gets – it defers to the GSS library instead. It is an error to send an extra null terminator in a GSS token message.

Query ‘Q’

Type	Description
Byte1(‘Q’)	Identifies the message as a simple query.
Int32	Length of message contents in bytes, including self.
String	The query string itself.

SSLRequest

Type	Description
Int32(8)	Length of message contents in bytes, including self.
Int32(80877103)	The SSL request code. The value is chosen to contain 1234 in the most significant 16 bits, and 5679 in the least 16 significant bits. (To avoid confusion, this code must not be the same as any protocol version number.)

StartupRequest

Type

Description

Int32

Length of message contents in bytes, including self.

Int32

The fixed protocol version number. The most significant 16 bits are the major version number. The least significant 16 bits are the minor version number. For example, protocol version number 3.8 should be described as (3 << 16 | 8) = 196616.

Frontend and backend must negotiate and use the same protocol version for communication. For historical reasons, in this message, there are two protocol version numbers send to the backend: fixed protocol version and requested protocol version. The behavior depends on the protocol version of the backend:

New servers (protocol version >= 3.7)

: The backend would use the frontend's requested protocol version to find the highest common protocol version in use for both frontend and backend, and send this effective protocol version back in a ParameterStatus message, with parameter name 'protocol_version'. The frontend can use this value to enable/disable functionalities.

Old servers (protocol version < 3.7)

: The backend ignores the requested version and only looks at the fixed protocol version. The backend accepts the connection if frontend's protocol version is equal or older than the backend's.

Next, the following pair of fields appears for each parameter. Parameters can appear in any order.

String

The parameter name. The following table contains the definition for currently recognized parameter names:

Parameter name

Description

user

The database user name to connect as. This is required unless a value is supplied for oauth_access_token, in which case the user can be an empty string. Otherwise it is required. There is no default.

database

The database to connect to.

protocol_version

The requested protocol version number, which is the highest version the frontend supports.

binary_data_protocol

'0' is the default. If set to '1', enable this session-level setting. For additional details, see Formats and Format Codes.

autocommit

'off' is the default. If set to 'on', enable AUTOCOMMIT setting.

client_label

The client connection label. You can use this label to distinguish client connections.

client_type

A string distinguish the type of this client. E.g. "JDBC Driver", "ADO.NET Driver".

client_version

A string distinguish the version of this client. Note that this is not required to follow the Vertica version number scheme. The open source drivers vertica-python and vertica-sql-go have independent version numbers.

client_pid

A string distinguish the process ID of this client.

client_os

A string distinguish the client's OS platform.

client_os_user_name

A string distinguish the OS login username of this client.

client_os_hostname

A string distinguish the OS hostname of this client.

mars

'off' is the default. If set to 'on', enable MARS setting.

oauth_access_token

An OAuth token that authorizes a user to the database. Specifying a value triggers OAuth authentication.

auth_category

A string indicating the type of authentication the client is prepared to do. Recognized values are "User", "Kerberos", "OAuth" and "SessionResume". Specify only one type at a time.

protocol_features

A JSON string of features requested by the driver. All features are disabled by default. Each feature enabled by the server will return a ParameterStatus message. These features are not bind with specific protocol version range. Currently supports the following features:

Feature name	Description	Support since
request_complex_types	If set to true, server will return complex type metadata. Otherwise, complex types will be treated as long varchar.	Server v12.0SP2
extend_copy_reject_info	If set to true, server will return a modified format of the WriteFile message when the RETURNREJECTED keyword is used in a copy command. This allows full diagnostic results from a failed copy command to be identified row by row.	Server v23.3.0
session_transfer_support	If set to true, server will send a SessionRedirect message when admin initiates session transfer request. Otherwise, client will be forcibly disconnected.	Server v?.?.0

Example value: {"request_complex_types":true,"extend_copy_reject_info":false,"session_transfer_support":true}

protocol_compat

A string representing the protocol that the driver is able to understand. Used in postgres compatibility. The server will try and determine the type of driver and which protocol to be used. If this parameter is provided, the server will defer to the value given instead of determining which protocol to use on its own. This may be used in conversion and extension of PG drivers to understand parts of the Vertica protocol.
Currently recognized values for protocol_compat are "PG" or "VER" for Postgres and Vertica respectively. If no parameter is passed, the server will attempt to determine the value based on other parameters provided.

workload

A string representing the workload name to be used by workload routing rules.

String

The parameter value. Most parameter values are of type String except for the protocol_version parameter. It is of type Int32 followed by an extra null terminator ('\0').

And finally:

Byte1('\0')

A zero byte is required as a terminator after the last name/value pair.

Sync ‘S’

Type	Description
Byte1(‘S’)	Identifies the message as a Sync command.
Int32(4)	Iength of message contents in bytes, including self.

Terminate ‘X’

Type	Description
Byte1(‘X’)	Identifies the message as a termination.
Int32(4)	Length of message contents in bytes, including self.

VerifiedFiles ‘F’

Type	Description
Byte1(‘F’)	Identifies the message as a response to VerifyFiles ‘F’ message. Provide a list of files that the backend can ask to load.
Int32	Length of message contents in bytes, including self.
Int32	The number of files. 0 if copy input is STDIN. Note: Before v3.15 this is a Int16 type.
Next, the following pair of fields appear for each file:
String	File name. The string is null terminated.
Int64	File content length in bytes.

Backend

AuthenticationOk ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(8)	Length of message contents in bytes, including self.
Int32(0)	Specifies that the authentication was successful.

AuthenticationCleartextPassword ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(8)	Length of message contents in bytes, including self.
Int32(3)	Specifies that a clear-text password is required.

AuthenticationMD5Password ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(32)	Length of message contents in bytes, including self.
Int32(5)	Specifies that an MD5-encrypted password is required. password = MD5(MD5(password + user) + salt)
Byte4	The salt to use when encrypting the password.
Int32	The user salt length, must equal to 16.
Byte16	The user salt, but not used.

AuthenticationGSS ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(8)	Length of message contents in bytes, including self.
Int32(7)	Specifies that GSS authentication is required.

AuthenticationGSSContinue ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32	Length of message contents in bytes, including self.
Int32(8)	Specifies that GSS authentication continue is required.
Byten	GSS data, which should be passed into an authGSSClientStep.

AuthenticationPasswordExpired ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32	Length of message contents in bytes, including self.
Int32(9)	Specifies that the password is expired.
Int32[n]	A bunch of integers that represent some restrictions on the password (min/max length, character classes, etc).

AuthenticationPasswordChanged ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(8)	Length of message contents in bytes, including self.
Int32(10)	Specifies that the password is changed.

AuthenticationPasswordGrace ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(8)	Length of message contents in bytes, including self.
Int32(11)	Specifies that the password is in its grace period.

AuthenticationOAuth ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(8)	Length of message contents in bytes, including self.
Int32(12)	Specifies that an OAuth access token is required.
String	OAuth Auth URL. New in version 3.15
String	OAuth Token URL. New in version 3.15
String	OAuth Client ID. New in version 3.15
String	OAuth Scope. New in version 3.16
String	OAuth validate hostname. New in version 3.16

AuthenticationSessionTransfer ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(8)	Length of message contents in bytes, including self.
Int32(13)	Specifies that session transfer info is required.

AuthenticationHashPassword ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(32)	Length of message contents in bytes, including self.
Int32(65536)	Specifies that a hashed password is required. password = SHA512(SHA512(password + userSalt) + salt)
Byte4	The salt to use when encrypting the password.
Int32	The user salt length, must equal to 16.
Byte16	The user salt.

AuthenticationHashMD5Password ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(32)	Length of message contents in bytes, including self.
Int32(65541)	Specifies that a hashed MD5-encrypted password is required. password = MD5(MD5(password + user) + salt)
Byte4	The salt to use when encrypting the password.
Int32	The user salt length, must equal to 16.
Byte16	The user salt, but not used.

AuthenticationHashSHA512Password ‘R’

Type	Description
Byte1(‘R’)	Identifies the message as an authentication request.
Int32(32)	Length of message contents in bytes, including self.
Int32(66048)	Specifies that a hashed SHA512-encrypted password is required. password = SHA512(SHA512(password + userSalt) + salt)
Byte4	The salt to use when encrypting the password.
Int32	The user salt length, must equal to 16.
Byte16	The user salt.

BackendKeyData ‘K’

Type	Description
Byte1(‘K’)	Identifies the message as cancellation key data. The frontend must save these values if it wishes to be able to issue CancelRequest messages later.
Int32(12)	Length of message contents in bytes, including self.
Int32	The process ID of this backend.
Int32	The secret key of this backend.

BindComplete ‘2’

Type	Description
Byte1(‘2’)	Identifies the message as a Bind-complete indicator.
Int32(4)	Length of message contents in bytes, including self.

CloseComplete ‘3’

Type	Description
Byte1(‘3’)	Identifies the message as a Close-complete indicator.
Int32(4)	Length of message contents in bytes, including self.

CommandComplete ‘C’

Type	Description
Byte1(‘C’)	Identifies the message as a command-completed response.
Int32	Length of message contents in bytes, including self.
String	The command tag. This is usually a single word that identifies which SQL command was sent.

CommandDescription ’m’

Type	Description
Byte1(’m’)	Identifies the message as a command description.
Int32	Length of message contents in bytes, including self.
String	The command tag. This is usually a single word that identifies which SQL command was sent.
Int16	1 if command was a prepared INSERT that can be converted to a COPY STDIN statement; 0 otherwise.
String	The COPY STDIN statement if it was possible to convert the prepared INSERT into a COPY.

CopyDoneResponse ‘c’

Type	Description
Byte1(‘c’)	Identifies the message as a Copy done response for “COPY FROM LOCAL FILES” command. Note: This is different from a CopyDone frontend message.
Int32(4)	Length of message contents in bytes, including self.

CopyInResponse ‘G’

Type	Description
Byte1(‘G’)	Identifies the message as a Start Copy In response.
Int32	Length of message contents in bytes, including self.
Int8	0 indicates the overall COPY format is textual (rows separated by newlines, columns separated by separator characters, etc). 1 indicates the overall copy format is binary (similar to DataRow format).
Int16	The number of columns in the data to be copied (denoted N below).
Int16[N]	The format codes to be used for each column. Each must presently be zero (text) or one (binary). All must be zero if the overall copy format is textual.

DataRow ‘D’

Type	Description
Byte1(‘D’)	Identifies the message as a data row.
Int32	Length of message contents in bytes, including self.
Int16	The number of column values that follow (possibly zero).
Next, the following pair of fields appear for each column:
Int32	The length of the column value, in bytes (this count does not include itself) (denoted n below). Can be zero. As a special case, -1 indicates a NULL column value. No value bytes follow in the NULL case.
Byten	The value of the column, in the format indicated by the associated format code.

EmptyQueryResponse ‘I’

Type	Description
Byte1(‘I’)	Identifies the message as a response to an empty query string. This substitutes for CommandComplete message.
Int32(4)	Length of message contents in bytes, including self.

EndOfBatchResponse ‘J’

Type	Description
Byte1(‘J’)	Signals that the server is done sending back rejected rows, and is ready for the next set of rows to be sent (or the copy end message).
Int32(4)	Length of message contents in bytes, including self.

ErrorResponse ‘E’

Type	Description
Byte1(‘E’)	Identifies the message as an error.
Int32	Length of message contents in bytes, including self.
The message body consists of one or more identified fields, followed by a zero byte as a terminator. Fields may appear in any order. For each field there is the following:
Byte1	A code identifying the field type; if zero, this is the message terminator and no string follows. The presently defined field types are listed in Error and Notice Message Fields section. Since more field types may be added in future, frontends should silently ignore fields of unrecognized type.
String	The field value.

LoadBalanceResponse ‘Y/N’

Type	Description
Byte1(‘N’)	Identifies the message as a LoadBalance response. The backend rejects the frontend LoadBalance request. Note: This is different from a NoticeResponse ‘N’ message.

Type	Description
Byte1(‘Y’)	Identifies the message as a LoadBalance response. The backend accepts the frontend LoadBalance request.
Int32	Length of message contents in bytes, including self.
Int32	The port number of the load balance target.
String	The host of the load balance target. The string is null terminated. This is an IP address, not a DNS name.

LoadFile ‘H’

Type	Description
Byte1(‘H’)	Identifies the message as a Copy-Local response to load the data from a file.
Int32	Length of message contents in bytes, including self.
String	The name of the file to load.

MarsResponse ‘_’

Type	Description
Byte1(’_’)	Identifies the message as a MARS response.
Int32(20)	Length of message contents in bytes, including self.
Int32	ResultSet ID.
Int32	Status: Created(0x1) Fetched(0x2) Closed(0x4) Canceled(0x8) Error(0x10)
Int64	Remaining rowcount.

NoData ’n’

Type	Description
Byte1(’n’)	Identifies the message as a no-data indicator. It is a response for transactions/DDL.
Int32(4)	Length of message contents in bytes, including self.

NoticeResponse ‘N’

Type	Description
Byte1(‘N’)	Identifies the message as a notice. Note: This is different from a LoadBalanceResponse ‘N’ message.
Int32	Length of message contents in bytes, including self.
The message body consists of one or more identified fields, followed by a zero byte as a terminator. Fields may appear in any order. For each field there is the following:
Byte1	A code identifying the field type; if zero, this is the message terminator and no string follows. The presently defined field types are listed in Error and Notice Message Fields section. Since more field types may be added in future, frontends should silently ignore fields of unrecognized type.
String	The field value.

ParameterDescription ’t’

Type	Description
Byte1(’t’)	Identifies the message as a parameter description.
Int32	Length of message contents in bytes, including self.
Int16	The number of parameters used by the statement (may be zero).
Int32	The number of elements in the type mapping pool, which is an array. This pool is used for non-native types like GEOMETRY and GEOGRAPHY (They are backed by LONG VARBINARY native type).
Then, for each type mapping element in the pool, there is the following:
Int32	Specifies the base object ID of the non-native data type.
String	Specifies the name of the non-native data type.
Then, for each parameter, there is the following:
Byte1	‘1’ if the parameter uses the type mapping pool; ‘0’ otherwise.
Int32	If the parameter uses the type mapping pool, specifies the position (0-indexed) in the type mapping pool. Otherwise, specifies the object ID of the parameter data type.
Int32	Specifies the parameter type modifier.
Int16	1 if the column has a NOT NULL constraint; 0 otherwise.

ParameterStatus ‘S’

Type	Description
Byte1(‘S’)	Identifies the message as a run-time parameter status report.
Int32	Length of message contents in bytes, including self.
String	The name of the run-time parameter being reported.
String	The current value of the parameter.

ParseComplete ‘1’

Type	Description
Byte1(‘1’)	Identifies the message as a Parse-complete indicator.
Int32(4)	Length of message contents in bytes, including self.

PortalSuspended ’s’

Type	Description
Byte1(’s’)	Indicates that a portal has stopped execution.
Int32(4)	Length of message contents in bytes, including self.

ReadyForQuery ‘Z’

Type	Description
Byte1(‘Z’)	ReadyForQuery is sent whenever the backend is ready for a new query cycle.
Int32(5)	Length of message contents in bytes, including self.
Byte1	Current backend transaction status indicator. Possible values are ‘I’ if idle (not in a transaction block); ‘T’ if in a transaction block; or ‘E’ if in a failed transaction block (queries will be rejected until block is ended).

RowDescription ‘T’

Type	Description
Byte1(‘T’)	Identifies the message as a row description.
Int32	Length of message contents in bytes, including self.
Int16	Specifies the number of fields in a row (may be zero).
Int32	The number of elements in the type mapping pool, which is an array. This pool is used for non-native types like GEOMETRY and GEOGRAPHY (They are backed by LONG VARBINARY native type).
Then, for each type mapping element in the pool, there is the following:
Int32	Specifies the base object ID of the non-native data type.
String	Specifies the name of the non-native data type.
Then, for each field in a row, there is the following:
String	The field name.
Int64	If the field can be identified as a column of a specific table, the object ID of the table; otherwise zero. (denoted *OID* below).
String	The schema name. Only sent if *OID* != 0.
String	The table name. Only sent if *OID* != 0.
Int16	If the field can be identified as a column of a specific table, the attribute number of the column; otherwise zero. The *OID* and the *attribute number* uniquely identify each field within the scope of a RowDescription message.
Int16	If the column is a complex type, the *attribute number* of the field that is considered to be the parent; otherwise zero. For example, an ARRAY[INT] will have one attribute number representing the ARRAY, and another attribute number representing the INT it contains. The INT will have a Parent that refers to the ARRAY. New in version 3.10 (for JDBC only) & in version 3.12 (for all drivers when ‘request_complex_types’ in the StartupRequest message is set to true): Previously, it only returned fields that did not have a parent.
Byte1	‘1’ if the column uses the type mapping pool; ‘0’ otherwise.
Int32	If the column uses the type mapping pool, specifies the position (0-indexed) in the type mapping pool. Otherwise, specifies the object ID of the column’s data type.
Int16	The data type size.
Int16	1 if the column can contain null values; 0 otherwise.
Int16	1 if the column is identity; 0 otherwise.
Int32	The type modifier. The meaning of the modifier is type-specific.
Int16	The format code being used for the field.

SessionRedirect ‘r’

Type	Description
Byte1(‘r’)	Identifies the message as a session redirect command.
Int32	Length of message contents in bytes, including self.
String	Host that this session is being redirected to.
Int32	Port that this session is being redirected to.
Int64	Length of session information, in bytes (this count does not include itself) (denoted n below).
Byten	The value of session information to be passed to the new host and port.

VerifyFiles ‘F’

Type	Description
Byte1(‘F’)	Identifies the message as a response to COPY FROM LOCAL command. The backend parses the file names out of the command, and sends them back to the frontend in this message. The frontend has to verify that these files exist and are readable before running the copy.
Int32	Length of message contents in bytes, including self.
Int16	The number of files (denoted N below). 0 if copy input is STDIN.
String[N]	The name of each data file to load.
String	The file name of Rejects.
String	The file name of Exceptions.

WriteFile ‘O’

Type	Description
Byte1(‘O’)	Identifies the message as a response to COPY FROM LOCAL command.
Int32	Length of message contents in bytes, including self.
String	A file name if the command uses the REJECTED DATA and/or EXCEPTIONS parameters. Empty if the command uses the RETURNREJECTED parameters.
Int32	File length.
String	File content (not null-terminated). Note: If the command uses the RETURNREJECTED parameter, file content (i.e. rejected row numbers) comes in little-endian Int64 format. If the command doesn’t use the RETURNREJECTED parameter and file content are very large (>8192 bytes), then file content are sent in chunks of 8192 bytes in multiple Writefile messages.

The modified WriteFile format when ’extend_copy_reject_info’ in the StartupRequest message is set to true and the command uses the RETURNREJECTED parameter:

Type	Description
Byte1(‘O’)	Identifies the message as a response to COPY FROM LOCAL … RETURNREJECTED command.
Int32	Length of message contents in bytes, including self.
String	File name. Empty because the command uses the RETURNREJECTED parameter.
Int32	File length.
Then, for each rejected row:
Int64	The rejected row number. little-endian format.
Int32	Rejected message length. little-endian format.
String	Rejected message content (not null-terminated).

Error and Notice Message Fields

This section describes the fields that may appear in ErrorResponse and NoticeResponse messages. Each field type has a single-byte identification token. Note that any given field type should appear at most once per message.

Identification token	Description
C	Code: the five-character SQLSTATE code for the error. See the “SQL state list” in Vertica Documentation for a list of all the SQLSTATE classes and values defined by Vertica.
D	Detail: additional information about the error message, in greater detail.
F	File: the file name of the source-code location where the error was reported.
H	Hint: an optional suggestion what to do about the problem. This is intended to differ from Detail in that it offers advice (potentially inappropriate) rather than hard facts. Might run to multiple lines.
L	Line: the line number of the source-code location where the error was reported.
M	Message: the primary human-readable error message. Always present.
P	Position
p	Internal position
q	Internal query
R	Routine: the name of the source-code routine reporting the error.
S	Severity: the field contents are ERROR, FATAL, or PANIC (in an error message), or WARNING, NOTICE, DEBUG, INFO, or LOG (in a notice message), or a localized translation of one of these. Always present.
V	Error code: the numeric error code that Vertica reports.
W	Where: an indication of the context in which the error occurred.

Data Collector utility

Data Collector component ClientServerMessages (table V_INTERNAL.dc_client_server_messages) collects a subset of Client-Server Protocol Messages sent between the Front End and Back End. Not all protocol messages are currently logged by the Data Collector. Additionally, some messages has verbose data available.

Logged Frontend Messages (FE)

Message (Character ID)	Contents logged
Query (‘Q’)	Masked Query String
Parse (‘P’)	statement name: masked query string
Bind (‘B’)	(empty)
Execute (‘E’)	portal name
Close (‘C’)	prepared statement/portal name
Describe (‘D’)	prepared statement/portal name
Flush (‘H’)	(empty)
Sync (‘S’)	(empty)
Terminate (‘X’)	(empty)
Startup Request (’^’)	(empty)
Mars Request (’_’)	(empty)
SSL Request (’*’)	(empty)
Load Balance Request (’?’)	(empty)
Cancel Request (’!’)	“Canceled Session” [canceled session ID]
Copy Data (’d’)	(empty)
Copy Done (‘c’)	(empty)
Copy Error (’e’)	(empty)
Copy Fail (‘f’)	(empty)
End of Batch Request (‘j’)	(empty)

Logged Backend Messages (BE)

Message (Character ID)	Contents logged
Parse Complete (‘1’)	(empty)
Bind Complete (‘2’)	(empty)
Close Complete (‘3’)	(empty)
Portal Suspended (’s’)	(empty)
Parameter Description (’t’)	(empty)
No Data (’n’)	(empty)
Load Balance Response(‘Y’/‘N’)	“load balance response” [yes/no]
Parameter Status (‘S’)	parameter name: parameter value
Command Complete (‘C’)	tag
Write File (‘O’)	(empty)
Load File (‘H’)	(empty)
Verify Files (‘F’)	(empty)
End of Batch Response (‘J’)	(empty)
Copy Done (‘c’)	(empty)

Setup and Usage

This DC Table is turned on by default. As with other session DC tables, this table contains columns for ’time’, ’node_name’, ‘session_id’, ‘user_id’, and ‘user_name’. The unique columns for this DC table are ‘source’, ‘message_type’, and ‘contents’.

The ‘source’ column describes whether the messages is a Frontend (FE) or Backend (BE) message.
The ‘message_type’ column provides the character ID that distinguishes the type of message. Additionally, verbose messages will have a “+” appended to the single character message_type column. For example, the StartupRequest message KV pairs each are represented as a row in the table with the message_type “^+”.
The ‘contents’ column lists the set of contents being logged, if any, for that message type.

Sample Query

=> SELECT source, message_type, contents FROM DC_CLIENT_SERVER_MESSAGES;

Clearing the DC Table

=> SELECT clear_data_collector('ClientServerMessages');

Support since Server v12.0SP3

Server Debug Logging

It is possible to enable the protocol debug log for the server. Execute the following SQL statement on the node to which the client connects:

=> SELECT set_debug_log('PROTOCOL','BASIC');

Now, the debug log messages in vertica.log can be filtered by the transaction ID.

Then, execute the following SQL statement to disable the protocol debug log after executing the client program:

=> SELECT clear_debug_log('PROTOCOL','BASIC');

Summary of Changes since Protocol 3.0

Protocol 3.16

Changes include:

Session transfer support:
- add SessionRedirect, AuthenticationSessionTransfer messages.
- new feature ‘session_transfer_support’ in StartupRequest message ‘protocol_features’ parameter.
- new valid value “SessionResume” in StartupRequest message ‘auth_category’ parameter.

Support since Server v2?.?.0

Changes include:

OAuth 2.0 Authentication enhancement: Format change in the AuthenticationOAuth message.

Support since Server v24.1.0

Protocol 3.15

Changes include:

OAuth 2.0 Authentication enhancement: Format change in the AuthenticationOAuth message to support OAuth browser workflow.
Workload support. Format change in the StartupRequest message. New ‘workload’ parmeter allowing specification of workload name to be used by workload routing rules.
Format change in VerifiedFiles message.
Format change in WriteFile message when new ‘protocol_features’ parameter called ’extend-copy-reject-info’ in StartupRequest is set to true. In 3.15 only set to true by ODBC driver. This allows the ODBC driver users to properly get rejected row information for bulk loaded data through calls to SQLGetDiagRec.
New Frontend protocol messages logged in DC_Client_Server_Messages table: CopyData, CopyDone, CopyFail, CopyError, EndOfBatchRequest
New Backend protocol messages logged in DC_Client_Server_Messages table: WriteFile, LoadFile, VerifyFiles, EndOfBatchResponse, CopyDone

Support since Server v23.3.0

Protocol 3.14

Changes include:

client_os_hostname support. Format change in the StartupRequest message. New ‘client_os_hostname’ parameter. Allow client’s OS hostname to be recorded in Sessions Table.

Support since Server v12.0SP4

Protocol 3.13

Changes include:

Postgres compatibility support. Format change in the StartupRequest message. New ‘protocol_compat’ parameter allowing specification of protocol understood by the driver.

Support since Server v12.0SP3

Protocol 3.12

Changes include:

Extended Complex Type support. Format change in the StartupRequest message. New ‘protocol_features’ parameter supporting ‘request_complex_types’.
Fall-Through Authentication Filtering: Format change in the StartupRequest message. New ‘auth_category’ parameter.
OAuth 2.0 Authentication enhancement: add AuthenticationOAuth message.

Support since Server v12.0SP2

Protocol 3.11

Changes include:

OAuth 2.0 Authentication support: Format change in the StartupRequest message. New ‘oauth_access_token’ parameter.

Support since Server v11.1SP1

Protocol 3.10

Changes include:

Complex Types support (JDBC only): Format change in the RowDescription message. OID change for 1D Arrays & Sets.

Support since Server v11.0

Protocol 3.9

Changes include:

JDBC Complex Type Metadata support: Server changed the way it represents some column metadata in v_catalog.types table.

Support since Server v10.1SP1

Protocol 3.8

Changes include:

Native UUID type support: Present the column as type UUID (object ID = 20) instead of type CHAR (object ID = 8).

Support since Server v9.0

Protocol 3.7

Changes include:

Client Backward compatibility (Server’s protocol version is older than the client’s): Format change in the StartupRequest message.
- The existing protocol_version parameter is now sent back to the client in a ParameterStatus message. It contains the protocol version supported by the server.

Support since Server v8.0

The following has been verified by reviewing the startup method in the JDBC driver’s protocol stream in the 9.1 source branch. The exact protocol version each was added is TBD.

Protocol 3.6

Changes include:

Format change in the StartupRequest message. New parameters added:
- mars
- client_os_user_name
New MARS related messages.

Support since Server v7.2

Protocol 3.5

Changes include:

SHA512 authentication support:
- New AuthenticationHashSHA512Password message
- New AuthenticationHashPassword message
Format change in the StartupRequest message. New parameters added:
- protocol_version
- binary_data_protocol
Server backward compatibility (Client’s protocol version is older than the server’s)

Support since Server v7.1

Protocol 3.4 and before

These are functional but buggy versions. Changes include:

Server returns the error code (V) in ErrorResponse and NoticeResponse messages.
Connection Load Balancing support.
User-defined Types support: Format change in the RowDescription message.
Support only one portal. The portal name required in some messages is actually ignored by the backend.
New CommandDescription message.

Note: As of protocol version 3.4, the StartupRequest message includes the following parameters:
user
database
client_pid
client_label
client_type
client_version
client_os

Support protocol 3.4 since Server v7.0