F.55. pg_throttle #
pg_throttle is a Tantor SE extension which adds a query throttling feature.
F.55.1. Functionality #
The extension allows setting an upper limit on the number of tuples sent from Scan nodes to upper nodes, i.e. a SeqScan node can return no more than 100 tuples per second to the upper nodes.
Throttling is set as the number of tuples returned by a scan node per second. Supported Scan nodes are:
SeqScan
IndexScan/IndexOnlyScan
BitmapHeapScan/BitmapIndexScan
CustomScan
ForeignScan
ResultScan/ValuesScan
FunctionScan
DML statements which can modify a table, i.e. INSERT, UPDATE, DELETE, are also supported, but only if a RETURNING clause is specified.
There are 2 work modes for throttling (see the example after this list):
per-node - throttling is set up per node independently
  greater speed (in comparison to query mode)
  finer control granularity
query - throttling state is shared between all nodes
  stricter throttling of query execution
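As a quick illustration (both functions are described in the API section below), the mode is selected by which configuration function is called; 'analytics' is an example role name:
-- per-node mode: each supported node may return up to 100 tuples per second
SELECT throttle_user_per_node_all('analytics', 100);
-- query mode: all nodes of a query share a single 100 tuples per second counter
SELECT throttle_user_query('analytics', 100);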
Throttle speed is measured as the number of tuples returned by the corresponding node in 1 second. The extension only tracks the number of tuples, and if the limit is reached in less than 1 second, it simply sleeps for the rest of that second. For example, with a throttling limit of 100 (100 tuples per second):
0.1s - 35 tuples returned
0.2s - 35 tuples returned
0.3s - 30 tuples returned
0.4-1s - sleep
F.55.2. Setting up #
The extension must be loaded using session_preload_libraries. During loading it looks up its configuration table (see below) and sets up hooks if necessary. Add the extension to postgresql.conf:
# postgresql.conf
session_preload_libraries = 'pg_throttle'
Last, start the database and create the extension:
CREATE EXTENSION pg_throttle;
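As an optional sanity check (not required by the extension), you can verify that the library is preloaded and the extension is installed using standard catalog queries:
SHOW session_preload_libraries;
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_throttle';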
Note
A custom schema for the extension can be specified.
F.55.3. Getting started #
Install the extension as discussed in the setting up section.
After that, you may want to throttle yourself to test the extension. Run this:
CREATE TABLE tbl( value int );
INSERT INTO tbl SELECT * FROM generate_series(1, 200);
SELECT throttle_user_per_node_all(current_user, 100);
Now this user (the one you use to log in) will be throttled. Exit, log in again and run this query:
SELECT * FROM tbl;
The duration of the query should be about 2 seconds: 200 tuples / 100 tuples per second = 2 seconds of delay.
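To observe the delay, psql's built-in timing can be used (any other timing method works just as well):
\timing on
SELECT * FROM tbl;
-- Time: roughly 2000 ms is expected with a limit of 100 tuples per second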
F.55.4. API & examples #
FUNCTION throttle_user_per_node(username text,
    seq_scan_max int, idx_scan_max int, idx_only_scan_max int,
    bitmap_heap_scan_max int, bitmap_index_scan_max int,
    custom_scan_max int, foreign_scan_max int, tid_range_scan_max int,
    modify_scan_max int, values_scan_max int, result_scan_max int,
    func_scan_max int);
This is the main function to set user throttling, but usually you will use the other short-hand functions below.
username - name of the user who should be throttled. If the specified user does not exist, an exception is thrown.
XXX_max - the other arguments specify throttling per corresponding node:
seq_scan_max - SeqScan
idx_scan_max - Index Scan
idx_only_scan_max - Index Only Scan
bitmap_heap_scan_max - Bitmap Heap Scan
bitmap_index_scan_max - Bitmap Index Scan
custom_scan_max - Custom Scan
foreign_scan_max - Foreign Scan
tid_range_scan_max - TID Range Scan
modify_scan_max - DML modification statements (INSERT, UPDATE, etc.), only with RETURNING
values_scan_max - Values Scan
result_scan_max - Result Scan
func_scan_max - Function Scan
Allowed values for the XXX_max arguments:
NULL - the node will not be throttled (for query_max, NULL enables per-node mode)
0 < XXX_max - sets processing of XXX_max tuples per second for the corresponding node
Note
Negative or zero values are not allowed. If you want to disable throttling, set the value to NULL.
Example - throttle the 'analytics' user:
SELECT throttle_user_per_node('analytics',
    100,        -- SeqScan
    1000, 1000, -- Index/IndexOnly Scan
    1000, 1000, -- Bitmap Heap/Index Scan
    NULL,       -- Custom Scan (disable throttling)
    10,         -- Foreign Scan
    100,        -- TID Range Scan
    NULL,       -- DML (disable throttling)
    NULL, NULL, -- Values/Result Scan (disable throttling)
    1000);      -- Function Scan
FUNCTION throttle_user_per_node_all(username text, max int);
This is a short-hand function for throttle_user_per_node which sets all XXX_max arguments to max.
Example:
SELECT throttle_user_per_node_all('analytics', 100);
FUNCTION throttle_user_query(username text, max int);
This function throttles all Scan nodes using shared state, which means all nodes have the same throttling counter. The counter applies to all supported nodes - you cannot configure specific nodes.
Example:
SELECT throttle_user_query('analytics', 100);
FUNCTION unthrottle_user(username text);
This function removes throttling from the specified user.
Example:
SELECT unthrottle_user('analytics');
TABLE throttlings( username text, query_throttle int, XXX_throttle int );
This is the main configuration table of the extension - it stores the throttling configuration for users:
username - name of the throttled user
query_throttle - throttle for query mode (NULL if per-node mode)
XXX_throttle - throttle value for the corresponding node
This table is read by the extension at backend startup, during extension initialization, so the user must have access to it - but that creates security risks. To address this issue see the security section below.
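For example, the configuration that currently applies to you can be inspected directly (with RLS enabled, as described in the security section, only your own row is visible):
SELECT * FROM throttlings WHERE username = current_user;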
F.55.5. Implementation details #
The extension is implemented using decorators.
When ExecutorStart_hook fires we search its PlanState for interesting (supported) nodes. If such a node is found we:
Create our Throttling node
Copy the contents of the old node
Save the ExecProcNode of the old node (and other data)
Replace ExecProcNode in the PlanState (it will be invoked by the executor)
Then during execution:
Executor calls PlanState->ExecProcNode
Our throttling function is invoked
Call the saved ExecProcNode to get a tuple
Handle throttling logic
Return the tuple
This applies to every supported node except CustomScan and ForeignScan. We cannot do the same for them, because extensions create their own structures on top of these nodes (also as decorators), so doing this again would overwrite their data. To overcome this we create such a decorator for their interfaces - CustomScanMethods and FdwRoutine.
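As a rough illustration of which plan node gets decorated, EXPLAIN on the getting started table shows the scan node whose ExecProcNode the extension wraps (the exact plan depends on the planner and your data):
EXPLAIN (COSTS OFF) SELECT * FROM tbl WHERE value > 10;
--        QUERY PLAN
-- ------------------------
--  Seq Scan on tbl
--    Filter: (value > 10)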
F.55.6. Caveats #
There are many caveats that you must be aware of.
F.55.6.1. Decorators #
The extension uses decorators to implement this functionality. If any other extension does the same, one day they may overwrite each other's data, which leads to unpredictable behaviour.
If you run into this, 1) please tell us (to warn other users) and 2) choose only one of the extensions.
F.55.6.2. Parallel workers #
Parallel workers are handled in the same way - they also use Scan nodes and throttling applies to them. But they have separate memory address spaces, so query mode will not work for them.
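If strict query-mode limits matter more than parallelism for a throttled role, one possible (entirely optional) workaround is to disable parallel query for that role; 'analytics' is an example role name:
ALTER ROLE analytics SET max_parallel_workers_per_gather = 0;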
F.55.6.3. Sawtooth wave #
The extension does not distribute the load evenly - it just sleeps when the threshold is reached. Because of that you may see sawtooth graphs.
F.55.6.4. Applying changes #
The extension caches the throttling configuration in static variables. It reads them during extension initialization and uses them afterwards.
So, if you change any throttling parameter it will only take effect in a new session. The affected user may need to reconnect.
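For example (assuming the 'analytics' role from the earlier examples):
-- Lower the limit for 'analytics'
SELECT throttle_user_per_node_all('analytics', 50);
-- Existing 'analytics' sessions keep their old limit;
-- sessions opened after this call use 50 tuples per second.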
F.55.6.5. User to whom it is applied #
Throttling logic is applied to the SESSION_USER.
If you change the current user, e.g. using SET ROLE, the throttling logic will still apply to the SESSION_USER with its startup configuration.
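A sketch of this behaviour ('analytics' is the throttled session user, 'reporting' is a hypothetical role):
-- Logged in as 'analytics' (throttled at session start)
SET ROLE reporting;
SELECT * FROM tbl;  -- still throttled according to the 'analytics' settings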
F.55.6.6. Security #
The configuration table throttlings is read during session startup (extension initialization) to get the session user's throttling parameters. Granting SELECT access to all users looks like a security leak, so built-in RLS is enabled for the table. There is a single rule - CURRENT_USER = username - so users can read only the throttlings for themselves.
But by default users do not have access to that table, and at session startup all queries will result in a permission denied error. To fix this, grant SELECT to public (or whatever role you want):
REVOKE ALL ON TABLE throttlings FROM public;
GRANT SELECT ON TABLE throttlings TO public;
Now only superusers can modify the throttlings configuration table.
Note
throttlings already has an RLS POLICY such that a session user can work only with its own records.
F.55.6.7. Dead tuples #
The extension only accounts for the number of tuples returned by a node, but does not account for the total number of page/buffer accesses. This leads to situations where the total number of touched pages/buffers significantly exceeds the imposed throttling.
Cases when this can happen:
Scan key (filter) - if the node has a filter, only tuples satisfying the filter are returned; hundreds of tuples may have been processed before any are returned from this node (see the example after this list)
Dead tuples - dead tuples are also read from pages, but they are not returned from the node; hundreds of dead tuples in a table may affect performance
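A filter example, continuing with the tbl table from the getting started section: the scan reads all 200 rows, but only the rows passing the filter count against the throttle, so the query finishes much faster than 2 seconds while still touching every page:
SELECT * FROM tbl WHERE value > 190;
-- 200 tuples are scanned, but only 10 are returned and counted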
F.55.6.8. DML throttling #
The extension throttles the number of returned tuples, but INSERT, UPDATE, DELETE and other table modification statements return only 1 tuple, so throttling is not applied to them.
But there are cases when it is:
The statement has a RETURNING clause - in this case the logic is applied correctly, because the node returns multiple tuples (see the example after this list)
The input for these nodes comes from other supported Scan nodes - examples of this include:
DELETE FROM tbl WHERE value > 1 - input for DELETE is a SeqScan (throttled) from the same table
INSERT INTO tbl SELECT * FROM tbl2 - input for INSERT is a SeqScan (throttled) from another table
INSERT INTO tbl SELECT * FROM generate_series(1, 100) - input for INSERT is a FunctionScan (throttled)
UPDATE tbl SET value = value + 1 - input for UPDATE is a SeqScan (throttled) from the same table
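An example of the RETURNING case (the tbl table from the getting started section, with a DML limit configured via modify_scan_max):
-- The modification node returns one tuple per updated row,
-- so the configured DML (modify_scan_max) limit applies
UPDATE tbl SET value = value + 1 RETURNING value;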