   Copyright (c) 2015, Oracle and/or its affiliates. All rights reserved.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; version 2 of the License.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software Foundation,
   51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA

Rewriter plugin
===============

This is a plugin which accesses the query after parsing and potentially
rewrites it.


Terms and definitions
=====================

- Rewrite rule: the specification of which queries should be rewritten and, if
  so, how they should be rewritten. An example of a rewrite rule would be:

    Rewrite all queries like "SELECT * FROM t WHERE c = ?"
    to "SELECT b FROM t WHERE c < ?"

  A rewrite rule consists of a pattern and a replacement.

- Pattern: the part of the rewrite rule that enables us to determine if a
  given query needs to be rewritten or not. The pattern syntax is identical to
  prepared statement syntax.

- Replacement: A new query, also in prepared statement syntax.

- Original query: Query which may get rewritten. This is the query as received
  by the server.

- Rewritten query: Final query once a rule has been applied to an
  original query.

- Literals: SQL literals (character strings, numbers, dates, etc.). Some
  literals may be extracted from the original query and inserted into the
  replacement to form the rewritten query.

- Parameter markers: These are used for two purposes:

  - Wildcards for literals. A parameter marker in the pattern matches any
    literal.

  - References to matched literals. If a parameter marker is also present in
    the replacement, the matched literal will be injected at that
    position. This process will continue with the rest of the matched literals
    until there are no more markers in the pattern. The order is
    left-to-right.

  Syntactically, parameter markers are represented by '?' like in prepared
  statements.

- Plugin user: DBA or anybody else who will be in charge of launching the
  plugin, or changing the rules in the table. This doesn't include users
  simply using the database and getting their queries rewritten.
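
As a sketch of how parameter markers carry literals over, assume a rule with
the following pattern and replacement (the schema db1, the table t and the
columns a and b are made up for illustration):

    pattern:     SELECT * FROM db1.t WHERE a = ? AND b = ?
    replacement: SELECT a FROM db1.t WHERE b = ? AND a = ?

Under the left-to-right rule described above, the original query

    SELECT * FROM db1.t WHERE a = 1 AND b = 'x'

would be expected to be rewritten to

    SELECT a FROM db1.t WHERE b = 1 AND a = 'x'

The first matched literal (1) goes to the first marker in the replacement and
the second ('x') to the second, regardless of which column each marker sits
next to.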


Usage and things to know
========================

Installation
------------

It is recommended to install the plugin using the supplied SQL script, which
will create a table to hold the rewrite rules, and triggers to keep the plugin
up to date. It will also revoke all privileges on the rules table for anyone
except the root user.

You can then add your rules in the table query_rewrite.rewrite_rules.
The table and schema have the following definitions:

  CREATE DATABASE IF NOT EXISTS query_rewrite;

  CREATE TABLE IF NOT EXISTS query_rewrite.rewrite_rules (
    pattern VARCHAR(10000) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    replacement VARCHAR(10000) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
    enabled CHAR(1) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT 'Y',
    message VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_bin
  ) DEFAULT CHARSET = utf8 ENGINE = INNODB;

  CREATE TRIGGER query_rewrite.new_rule
  AFTER INSERT ON query_rewrite.rewrite_rules
  FOR EACH ROW
    SET @@global.Rewriter_table_changed = 1;

  CREATE TRIGGER query_rewrite.changed_rule
  AFTER UPDATE ON query_rewrite.rewrite_rules
  FOR EACH ROW
    SET @@global.Rewriter_table_changed = 1;

  CREATE TRIGGER query_rewrite.deleted_rule
  AFTER DELETE ON query_rewrite.rewrite_rules
  FOR EACH ROW
    SET @@global.Rewriter_table_changed = 1;


You can insert a new rewrite rule by doing:

    INSERT INTO query_rewrite.rewrite_rules ( pattern, replacement )
    VALUES ( 'SELECT * FROM t1 WHERE c1=?', 'SELECT * FROM t1 WHERE c2=?' );

You can also install the plugin yourself:

    INSTALL PLUGIN rewriter SONAME 'rewriter.so';

but then you also have to create the rules table yourself. Unless you also
add the triggers, you will have to set the variable Rewriter_table_changed to
ON each time you change the table.
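
Without the triggers, a manual refresh might look like the sketch below. It
assumes the rules table was created with the definition shown earlier and
uses the same assignment that the supplied triggers perform. The schema db1
and the rule itself are only examples.

    INSERT INTO query_rewrite.rewrite_rules ( pattern, replacement )
    VALUES ( 'SELECT * FROM db1.t1 WHERE c1=?',
             'SELECT * FROM db1.t1 WHERE c2=?' );

    -- Tell the plugin that the rules table has changed.
    SET @@global.Rewriter_table_changed = 1;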


Refreshing the plugin
---------------------

The plugin keeps a copy of the rules table in memory in order to enable
quicker matching of rule patterns. Updates to the rules table are not picked
up by the plugin immediately. Instead, the triggers on the table set the
variable Rewriter_table_changed whenever a row is inserted, updated or
deleted. The next time a query is sent to the server and the plugin is
invoked, the in-memory table is reloaded from the database table.
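
One way to check that a change has been picked up, sketched here with the
status variables described later in this document, is to compare the reload
counter before and after the change. The rule being disabled is the example
rule inserted above; the SELECT 1 is just an arbitrary statement that gives
the plugin a chance to run.

    SHOW GLOBAL STATUS LIKE 'Rewriter_number_reloads';

    UPDATE query_rewrite.rewrite_rules SET enabled = 'N'
    WHERE pattern = 'SELECT * FROM t1 WHERE c1=?';

    -- The next statement sent to the server should cause the reload.
    SELECT 1;

    SHOW GLOBAL STATUS LIKE 'Rewriter_number_reloads';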


Principle
---------

All original queries will be checked for matches and possibly get rewritten to
the replacement query.

- The matching is done in three stages of increasing granularity for
  performance reasons:

  1) Digest match. This is a quick-reject test with a relatively high false
     positive ratio, but without false negatives. As with all digest
     calculations there is an (extremely small) risk of hash collisions. There
     is also a limit on how much of the query is used to calculate the
     digest. Hence extremely long queries that differ only far into the
     query will always collide. The digest is calculated by performance_schema
     and is not actually part of the rewrite framework.

  2) Tree structure matching. This makes sure that the original query and
     pattern have the same structure. The check is carried out by comparing
     the normalized query representation. Please refer to the section "21.7
     Performance Schema Statement Digests" in the MySQL manual for details on
     normalized query representation. Practically, queries such as

       SELECT 1 FROM table WHERE name = 'Andrew'
       and
       SELECT 2 FROM table WHERE name = 'Lars'

     will pass this test, since both are normalized to

       SELECT ? FROM table WHERE name = ?.

  3) Literal matching. At this stage it has been established that the parse
     trees of the query and the pattern are equal. All that can differ at this
     point are the literal values.

- If either the pattern or the replacement is an incorrect SQL query (that
  is, it generates syntax errors), the rule will simply be disabled (the
  enabled column will be set to 'N') and won't affect the execution of other
  rules. If this happens, the plugin will also write a message in the row's
  'message' column.

- If a query is rewritten, an SQL note is generated to indicate it.

- It is possible to have a pattern that has more parameters than the
  replacement - in which case the extra ones are just ignored. The opposite -
  more parameters in the replacement - is not allowed and will cause the rule
  not to be loaded into memory. The plugin will let you know this by updating
  the rule's 'message' column in the rules table.

- Database names are resolved during parsing, and therefore it is recommended
  to always qualify table names in patterns with the database name. It is not
  a requirement but will generally make life a lot easier. Consider the
  following rule:

    INSERT INTO query_rewrite.rewrite_rules ( pattern, replacement )
    VALUES ( 'SELECT * FROM t1', 'SELECT "rewritten"' );

  The name t1 is resolved - and qualified - during parsing, and hence the
  digest will in fact be calculated on SELECT * FROM "database".t1, where
  "database" is whatever the current database was when the rule was created.
  The plugin uses the user's session. This may or may not be what was
  intended, but qualifying the table name is always clearer. It follows that
  it is not possible to define a catch-all rule to rewrite queries against all
  tables with the same name regardless of database.

- When an error occurs on loading a rule, the status variable
  Rewriter_reload_error will be ON and an error message will be written in the
  rule's 'message' column.
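
The points above can be tied together in a short session. This is only a
sketch: the schema db1, the table t1 and its columns are assumptions, and the
exact text of notes and messages is not shown because it is not specified
here.

    -- A rule with a qualified table name, so that it does not depend on
    -- the current database.
    INSERT INTO query_rewrite.rewrite_rules ( pattern, replacement )
    VALUES ( 'SELECT * FROM db1.t1 WHERE c1 = ?',
             'SELECT c1, c2 FROM db1.t1 WHERE c1 = ?' );

    -- A query matching the pattern; any literal matches the marker.
    SELECT * FROM db1.t1 WHERE c1 = 42;

    -- If the query was rewritten, the note should be listed here.
    SHOW WARNINGS;

    -- Rules that were disabled or failed to load carry an explanation.
    SELECT pattern, enabled, message
    FROM query_rewrite.rewrite_rules
    WHERE enabled = 'N' OR message IS NOT NULL;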


Uninstallation
--------------

If you want to clean up completely, i.e. delete the rules table and database,
it is recommended to use the supplied uninstall script. If you just want to
uninstall the plugin (a later installation will pick up the rules table where
you left off, reading it into memory), you may issue:

    UNINSTALL PLUGIN rewriter;


Status variables
----------------

The plugin defines a few status variables:

  - Rewriter_number_rewritten_queries: the number of queries which have been
    rewritten since last installing the plugin.

  - Rewriter_reload_error: ON if an error occurred when loading the rewrite
    rules table. The offending rule will have an error message in its
    'message' column.

  - Rewriter_number_loaded_rules: The number of rewrite rules in the in-memory
    rewrite hash table.

  - Rewriter_number_reloads: The number of times the rules table has been
    loaded into memory.

There is one system variable:

  - Rewriter_table_changed: If ON, the plugin will reload the rules table
    into memory the next time it is invoked.
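
Assuming the plugin is installed, the current values can be inspected with
the standard SHOW statements:

    SHOW GLOBAL STATUS LIKE 'Rewriter%';
    SHOW GLOBAL VARIABLES LIKE 'Rewriter_table_changed';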
