Step by Step Guide to Caching in JBoss Data Virtualization 6.1

Vijay Chintalapati bio photo By Vijay Chintalapati Comment
Quick Index
Prerequisites
Result Set Caching
Internal Materialization
External Materialization

Prerequisites

  • Sample Teiid project
  • One or more source models created
  • One or more view models created and the views (virtual tables) that require caching/materialization identified

Result Set Caching

Background

Straight from the documentation the definition of Result Set Caching is:

User query result set caching will cache result sets based on an exact match of the incoming SQL string and PreparedStatement parameter values if present. Caching only applies to SELECT, set query, and stored procedure execution statements; it does not apply to SELECT INTO statements, or INSERT, UPDATE, or DELETE statements.

Key points to remember

  • Result set caching is enabled by default (with a default ttl)
  • The client must explicitly choose to use result set caching by passing : resultSetCacheMode=true in the JDBC URL to the VDB
  • The exact SQL string, including the cache hint if present, must match the cached entry for the results to be reused.
  • Result set caching is not transactional
  • From the web management console, one can choose to clear the result set cache of a specific VDB
  • Result set caching is not limited to memory

Consult the documentation for more in-depth information about result set caching.

Testing Result Set Caching

Step 1: Deploy a sample VDB for testing purposes. Virtual database dvdemo.vdb will be used in this example.

Step 2: Configure the JDBC connection to the VDB from any SQL client. We will use Database Development perspective of JBoss Developer Studio for this purpose.

  1. Right-click on the Database Connections and create a new connection profile
  2. Create a new Teiid connection profile as shown in the image below and click on Next
  3. Fill in the connection details to the VDB. Ensure to populate the URL property resultSetCacheMode=true. We will be using the teiidUser login for this purpose and click on Finish
  4. Explore the Teiid connection just established and open an new SQL scrapbook to run queries

Step 3: Run queries against the VDB without any customization (such as cache hints)

  1. Try a plain, simple/sample query without any customization and review the status to see how long it took to run the query

    Note in the image below, the status says that query took 1 sec and 23 ms
  2. Re-execute the same query and check on how long it took to run the same query with default caching enabled. Based on the image below we saved 14 ms. The time saved widens as the result sets gets larger and larger.

Step 4: Run queries against the VDB with cache hints

The format of the cache hint is:

/*+ cache[([pref_mem] [ttl:n] [updatable])] [scope:(session|user|vdb)] */ select * from sample 

The official documentation is more effective in explaining cache hints and options. The following images show the usage of cache hints and options.

  1. Using Cache Hint
  2. Overriding the use of cache results by appending OPTION NOCACHE to the end of the SQL query. Pay attention to the run time of the query to realize the cached results are not used

Internal Materialization

Internal materialization is caching of all records of a specific view (virtual table). Enabling Internal Materialization on a specific view is done in the Teiid Designer and before the view is packed into a VDB, as shown in the image below. Focus on the field the Yellow tool-tip is pointing to; the Materialized property of the view is a drop-down with two options true/false. Enabling Internal Materialization is as simple as setting this property to true.

Key points about Internal Materialization

  1. If the tranformation query behind the view has no cache hints, then the results are cached for the lifetime of the JVM
  2. One can assign a cache hint to the transformation query behind the view to enable cache expiry after a defined ttl (time to live) as shown below. Note : If a new query arrives after the cache contents have expired, an asynchronous job refreshes the cache, till that job is complete the query will not be blocked and stale entries will be served
  3. One can invalidate the cache on demand using one of the two ways below. Invalidation will block the queries until the cache is refreshed.
    • Using the web based admin console of the DV runtime server. Focus on all the selected/highlighted entries in the image
    • Connecting to the deployed VDB via a SQL client and executing the command of the following form
  EXEC SYSADMIN.refreshMatView(viewname=>'Customer.customers', invalidate=>true)
  

Testing Internal Materialization

Testing internal materialization is only possible after bundling the materialized into a VDB and deploying it. Once deployed, just as shown above with any SQL client, any kind of SELECT query (with whatever WHERE clause conditions) against the view will fully populate the cache.

Subsequent queries then take advantage of the cached records.


External Materialization

External materialization in JBoss Data Virtualization is the process of persisting all the records of a materialized view to a physical/source model instead of storing the records in memory or being managed by Teiid BufferManager

The article that comes closest to guiding thru the steps pictorially can be found here. In that article the external materialization is made in to a traditional relational source such as MySQL.

We will challenge ourselves here and externally materialize a view to MongoDB.

Setup a MongoDB Source Model

Step 1 : In the current project, right click on the sources folder and choose Import

Step 2 : Choose Teiid Connection » Source Model in the options provided as shown below and click Next

Step 3 : Click on New in the window that is presented as shown below

Step 4 : Populate the connection details as shown and click on Apply and then click on Ok

After a successful connection the following window is displayed. Click Next on this screen and next to get to the screen shown in the subsequent image

Step 5 : Provide the Source Model name as shown in the image below and click Next

Step 6 : Click Next on the screen shown

Step 7 : Choose one of the existing tables and click on Finish just to proceed further. Imported tables will not be used, so it does not matter which one is chosen.

Step 8 : [OPTIONAL] In the newly created MongoDB source model delete the imported table by right-clicking on the table and choosing Delete. Save the source model when done.

Create Materialized Views

Step 1 : Right click on the view to externally materialize and choose Create Materialized Views as shown below:

Step 2 : In the next dialog, choose the source model for the materialized view, as shown below, and click on Finish:

Step 3 : Open the source model when the materialized view is created, one should find a schema as shown below. Double-click on it to get to view the materialized table created.

Step 4 : One would usually keep the schema in place but we will move the table created under the schema to root of source model with cut and paste as shown below:

Step 5 : Delete the lingering empty by right-clicking on schema name and choosing Delete

Step 6 : Clone the materialized customers table and call it staging_customers

Step 7 : Set the Name in Source property of tables to the table names respectively

That is, customers would be set to customers and staging_customers would be set to staging_customers.

Step 8 : Final step, correct the Materialized Table property value of the customers as shown below:

The property value would be set to:
customers (Path=/dvdemo/sources/Local_MongoDB.xmi)

Testing External Materialization

Step 1 : Package a dvdemo.vdb as shown below and ensure that each source model is bound to appropriate JNDI name as shown below. Also, uncheck the visibility of the source models as shown below.

Step 2 : Deploy the VDB, connect to it view Database Development perspective and execute shown queries:

DELETE FROM Local_MongoDB.customers;
INSERT INTO Local_MongoDB.customers SELECT * FROM Customer.customers OPTION NOCACHE;
SELECT * FROM Customer.customers

Note the run time of the query and check the server logs. Notice that the runtime is just few milli-seconds shorter than querying the view and the logs show that the results are being fetched from MongoDB instead of both the other sources.

Note: For this to accurately work one of the datasources, Local_MySQL or Local_Postgres, had to be made an XA datasource. For some reason reading from the view and writing to Local_MongoDB.customers is treated as a XA transaction at least between MySQL and Postgres

Use of staging table and scheduling periodic refresh of the materialized table

Step 1 : Get the necessary MongoDB jar files and drop them in $JDV_HOME/dataVirtualization/teiid-adminshell/lib

  1. mongodb-driver-3.0.2.jar
  2. mongodb-driver-core-3.0.2.jar
  3. bson-3.0.2.jar

Step 2 : Create a groovy script load_mongo.groovy with the following content

sql=connect("jdbc:teiid:dvdemo@mm://localhost:31000;resultSetCacheMode=true","teiidUser","teiidUser123!");
sql.execute("DELETE FROM Local_MongoDB.staging_customers");
sql.execute("INSERT INTO Local_MongoDB.staging_customers SELECT * FROM Customer.customers OPTION NOCACHE");
sql.close();

import com.mongodb.*
mongoClient = new MongoClient()
collection = mongoClient.getDB("test").getCollection("staging_customers")
collection.rename("customers",true)

Note: Ensure understanding of the repercussions of renaming a collection by reading up on the MongoDB documentation

Step 3 : Test the groovy script (placed in the same folder as teiid-adminshell) by running the following command

./adminshell.sh . load_mongo.groovy

Step 4 : On Linux/Unix machines where crontab works, create a file by the name crontab (it can be any name)

The contents of the file used in this example is shown below:

Execute the following command once the file is created and in the same folder as the file:

crontab crontab

The job will now run every minute as scheduled in the crontab file.

comments powered by Disqus