Appendix¶
Create SQL Server Extended Events Session¶
Create Extended Events Session¶
-- Query to create an extended events session
CREATE EVENT SESSION [alation_query_log] ON SERVER
ADD EVENT sqlserver.sp_statement_completed
(
ACTION
(
package0.collect_system_time,
package0.event_sequence,
sqlos.task_time,
sqlserver.client_app_name,
sqlserver.database_id,
sqlserver.database_name,
sqlserver.nt_username,
sqlserver.server_principal_name,
sqlserver.server_principal_sid,
sqlserver.session_id,
sqlserver.session_nt_username,
sqlserver.transaction_id,
sqlserver.username
)
WHERE
-- Lineage and context queries:
(
[sqlserver].[is_system] = 0
-- add other databases you want to exclude e.g. test/dev databases, etc.
-- AND [sqlserver].[database_name] <> 'master' -- uncomment this if you do not want queries from master database
AND [sqlserver].[database_name] <> 'model'
AND [sqlserver].[database_name] <> 'msdb'
AND [sqlserver].[database_name] <> 'tempdb'
-- Filter for specific databases, Provide appropriate database names instead of placeholder names DB1, DB2, DB3, etc.
-- AND ( [sqlserver].[database_name] = N'DB1' OR [sqlserver].[database_name] = N'DB2' )
-- Or alternatively, use database_id. Use database_id for high performance
-- AND ( [sqlserver].[database_id] = <integer1> OR [sqlserver].[database_id] = <integer2> )
-- Add other users you want to exclude e.g. monitoring software users, sa etc.
AND [sqlserver].[username] <> N'NT SERVICE\SQLTELEMETRY'
AND [sqlserver].[username] <> 'sys'
AND [sqlserver].[username] <> 'INFORMATION_SCHEMA'
-- Filter for specific databases, add more OR conditions for users if required
-- AND [sqlserver].[username] = N'<user>'
-- Add any other apps you want to exclude; SSMS queries are not excluded by the filters below
AND [client_app_name] <> 'Microsoft SQL Server Management Studio - Transact-SQL IntelliSense'
AND [client_app_name] <> 'Microsoft SQL Server Management Studio'
-- stubborn statements (add more if needed)
AND [statement] NOT LIKE 'SELECT StatMan%'
AND
(
[statement] like 'ALTER%' OR
[statement] like 'CREATE%' OR
[statement] like 'DROP%' OR
[statement] like 'TRUNCATE%' OR
[statement] like 'MERGE%' OR
[statement] like 'SELECT%INTO%' OR
[statement] like 'INSERT%INTO%FROM %' OR
[statement] like 'UPDATE%FROM%' OR
[statement] like 'USE%' OR
[statement] LIKE 'SELECT%FROM%' OR
[statement] like 'EXEC%' OR
[statement] like '%ALTER%' OR
[statement] like '%CREATE%' OR
[statement] like '%DROP%' OR
[statement] like '%TRUNCATE%' OR
[statement] like '%MERGE%' OR
-- INSERT statements can be of type "SELECT INTO ..." or "INSERT INTO ..."
[statement] like '%SELECT%INTO%' OR
[statement] like '%INSERT%INTO%FROM%' OR
[statement] like '%UPDATE%FROM %' OR
[statement] like '%USE%' OR
[statement] like '%EXEC%' OR
[statement] LIKE '%SELECT%FROM%' OR
[statement] like '%FROM%'
)
-- If the files are filling quickly (i.e. too many events)
-- un-comment next line to only write 1 out of every 20 logs to the .xel files
-- However one could potentially miss lineage and important queries
-- All events still get processed; this setting only
-- controls what gets logged to the .xel files
-- If you want 25% of queries written change the integer to "4", etc.
-- AND package0.divides_by_uint64(package0.counter, 20)
)
),
ADD EVENT sqlserver.sql_statement_completed
(
SET collect_statement=(1)
ACTION
(
package0.collect_system_time,
package0.event_sequence,
sqlos.task_time,
sqlserver.client_app_name,
sqlserver.database_id,
sqlserver.database_name,
sqlserver.nt_username,
sqlserver.server_instance_name,
sqlserver.server_principal_name,
sqlserver.session_id,
sqlserver.session_nt_username,
sqlserver.transaction_id,
sqlserver.username
)
-- Generic Alation filters, we only make use of certain types of statements
-- Note SELECT is omitted because SELECT without FROM is not useful for our analysis
WHERE
-- Lineage and context queries:
(
[sqlserver].[is_system] = 0
-- add other databases you want to exclude e.g. test/dev databases, etc.
-- AND [sqlserver].[database_name] <> 'master' -- uncomment this if you do not want queries from master database
AND [sqlserver].[database_name] <> 'model'
AND [sqlserver].[database_name] <> 'msdb'
AND [sqlserver].[database_name] <> 'tempdb'
-- Filter for specific databases, Provide appropriate database names instead of placeholder names DB1, DB2, DB3, etc.
-- AND ( [sqlserver].[database_name] = N'DB1' OR [sqlserver].[database_name] = N'DB2' )
-- Or alternatively, use database_id. Use database_id for high performance
-- AND ( [sqlserver].[database_id] = <integer1> OR [sqlserver].[database_id] = <integer2> )
-- Add other users you want to exclude e.g. monitoring software users, sa etc.
AND [sqlserver].[username] <> N'NT SERVICE\SQLTELEMETRY'
AND [sqlserver].[username] <> 'sys'
AND [sqlserver].[username] <> 'INFORMATION_SCHEMA'
-- Filter for specific databases, add more OR conditions for users if required
-- AND [sqlserver].[username] = N'<user>'
-- Add any other apps you want to exclude; SSMS queries are not excluded by the filters below
AND [client_app_name] <> 'Microsoft SQL Server Management Studio - Transact-SQL IntelliSense'
AND [client_app_name] <> 'Microsoft SQL Server Management Studio'
-- statements that show up but you don't want to log(add more if needed)
AND [statement] NOT LIKE 'SELECT host_platform FROM sys.dm_os_host_info'
AND [statement] NOT LIKE 'SELECT dtb.name AS [Name], dtb.state AS [State] FROM master.sys.databases dtb'
AND
(
[statement] like 'ALTER%' OR
[statement] like 'CREATE%' OR
[statement] like 'DROP%' OR
[statement] like 'TRUNCATE%' OR
[statement] like 'MERGE%' OR
[statement] like 'SELECT%INTO%' OR
[statement] like 'INSERT%INTO%FROM %' OR
[statement] like 'UPDATE%FROM%' OR
[statement] like 'USE%' OR
[statement] like 'EXEC%' OR
[statement] LIKE 'SELECT%FROM%' OR
[statement] like '%ALTER%' OR
[statement] like '%CREATE%' OR
[statement] like '%DROP%' OR
[statement] like '%TRUNCATE%' OR
[statement] like '%MERGE%' OR
-- INSERT statements can be of type "SELECT INTO ..." or "INSERT INTO ..."
[statement] like '%SELECT%INTO%' OR
[statement] like '%INSERT%INTO%FROM%' OR
[statement] like '%UPDATE%FROM %' OR
[statement] like '%USE%' OR
[statement] like '%EXEC%' OR
[statement] LIKE '%SELECT%FROM%' OR
[statement] like '%FROM%'
)
-- If the files are filling quickly (i.e. too many events)
-- un-comment next line to only write 1 out of every 20 logs to the .xel files
-- However one could potentially miss lineage and important queries
-- All events still get processed; this setting only
-- controls what gets logged to the .xel files
-- If you want 25% of queries written change the integer to "4", etc.
-- AND package0.divides_by_uint64(package0.counter, 20)
)
)
ADD TARGET package0.event_file
(
-- CONFIGURE THIS:
SET filename=N'C:\Users\Public\Documents\alation_query_log.xel',
-- Note: Alation will read one file at a time so this is the size of the file that may be read into memory while it is being fetched.
-- max file size in MB before rolling over:
max_file_size=(100), -- in MB
max_rollover_files=(100)
)
WITH
(
-- OPTIONALLY CONFIGURE THIS. The total event buffer size:
MAX_MEMORY=50 MB,
-- If buffer is full, events will be dropped instead of blocking the server:
EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
-- OPTIONALLY CONFIGURE THIS. Max time before writing events to storage:
MAX_DISPATCH_LATENCY=30 SECONDS,
-- Any events that are too large to fit in the buffer will be dropped:
MAX_EVENT_SIZE=0 KB,
MEMORY_PARTITION_MODE=NONE,
TRACK_CAUSALITY=OFF,
STARTUP_STATE=ON
);
GO
-- Query to start the session (change START -> STOP to stop it)
ALTER EVENT SESSION [alation_query_log] ON SERVER
STATE = START;
GO
Note
filename
- Specify the xel file name comprising of absolute path and the file prefix. For example, C:\Users\Public\Documents\alation_query_log.xel. In this example, C:\Users\Public\Documents is the path and alation_query_log is the prefix.max_file_size
- Specify the maximum file size. Once the file size exceeds the limit, the SQL Server creates a new file. Alation recommends a maximum file size of 100 MB before rolling over.max_rollover_files
- Specify the maximum number of rollover xel files to create when file size exceeds the maximum file size specified (max_file_size
). Once the limit exceeds, the SQL Server removes the oldest file and creates a new one. Alation recommends a maximum of 100 rollover files.MAX_MEMORY
- Specify the maximum event buffer memory size. Once buffer memory is full, the SQL Server writes events to the file. Alation recommends a maximum buffer memory size of 50 MB the file IO operation.MAX_DISPATCH_LATENCY
- Specify the maximum time in seconds after which SQL Server writes all events in the memory buffer to the file. Alation recommends a maximum of 30 seconds.
Note
The SQL Server performs the file write operation when either the MAX_DISPATCH_LATENCY
or MAX_MEMORY
limit is reached.
Delete and Recreate the Session¶
If you want to recreate the session, you need to first drop the existing session.
Use the following query to delete a session:
DROP EVENT SESSION [alation_query_log] ON SERVER;
GO
-- Check if the session is dropping events and see other data about the session
Using Database ID instead of Database Name¶
If you wish to limit sampling to specific databases, you can either provide database_name
or database_id
.
To get the database id:
SELECT DB_ID ('database name')
This command can be run in SSMS or any query tool.
Provide the specific database id’s in place of <integer> in the appropriate sections of the query. Comments explain where you can alternatively use database_name
or database_id
:
AND [sqlserver].[database_id]=<integer>
Filter for Users and Databases¶
You can filter the usernames and databases you want to exclude to get queries. Any queries against the master or queries written by OLTP processes or monitoring processes can be filtered.
To filter usernames:
[sqlserver].[user_name]
To filter databases:
[sqlserver].[database_name]
Note
The default sampling size is 1 out of 20 queries. If too many queries are extracted, setting this higher (1 out of 30 queries) along with the filters mentioned above will reduce the volume of queries extracted. If this is a low-traffic server, you must set this to 1 out of 5 or even comment them out so you get every query (this is not recommended if the filters mentioned above are not used).