Hybrid Partitioned Tables in Oracle Database 19c

With Oracle Database 19c a new concept called Hybrid Partitioned Tables has been introduced. As the name implies, the partitions can exist both inside the Oracle database in tablespaces and outside in external sources such as Linux files, files on Hadoop Distributed File System (HDFS), and files in Cloud Object Storage. Hybrid Partitioned Tables enhance the functionality of partitioning for mixed Big Data scenarios where large portions of a table can reside in external partitions.

Using this feature you may like to move older non-active table partitions to cheaper storage options outside the database – while the active data is hosted in partitions stored inside the Oracle database. The Hybrid Partition Tables feature enables the integration of internal and external partitions – all in the same table.

The Hybrid Partitioned Tables feature extends Oracle partitioning by enabling partitions to reside in both – Oracle database segments and external files and sources. To accomplish this, the two concepts – external table functionality and partitioning – have been integrated. Hybrid Partitioned Tables support all existing external table driver types for external partitions including ORACLE_ DATAPUMP, ORACLE_LOADER, ORACLE_HDFS and ORACLE_ HIVE.

ORACLE_LOADER is the default external tables access driver to read data from flat files, The ORACLE_DATAPUMP access driver however can write data to a dump file in a binary format that can only be read by the ORACLE_DATAPUMP access driver. With ORACLE_HDFS and ORACLE_HIVE, data stored in HDFS and Hive tables on Hadoop and Hive clients can be accessed as if that data was stored in tables in an Oracle database. As a prerequisite you need access to a directory object in the database to access the external sources.

Let’s review this feature and understand how to combine internal and external partitions into a single Hybrid Partitioned Table in Oracle Database 19C.

Hybrid partitioned tables support the following external table types for external partitions.

  • ORACLE_DATAPUMP
  • ORACLE_LOADER
  • ORACLE_HDFS
  • ORACLE_HIVE
  • Restrictions

The ORACLE_LOADER and ORACLE_DATAPUMP examples:

  • Create Hybrid Partitioned Table (ORACLE_LOADED)
  • Create Hybrid Partitioned Table (ORACLE_DATAPUMP)
  • DML operations on the External Partitions
  • Adding and Dropping Partitions

Creating a Hybrid Partitioned Table (ORACLE_LOADER)

Create some flat files with dummy data to use with the external table partitions.

SET MARKUP CSV ON QUOTE ON
SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK OFF PAGESIZE 0
SPOOL /tmp/gbr1.txt
SELECT 'GBR', object_id, owner, object_name FROM all_objects
WHERE object_id <= 3999 AND rownum <= 1000;
SPOOL OFF;

SPOOL /tmp/gbr2.txt
SELECT 'GBR', object_id, owner, object_name FROM all_objects
WHERE object_id BETWEEN 4000 AND 5999 AND rownum <= 1000;
SPOOL OFF;

SPOOL /tmp/ire1.txt
SELECT 'IRE', object_id, owner, object_name FROM all_objects
WHERE object_id <= 3999 AND rownum <= 1000;
SPOOL OFF;

SPOOL /tmp/ire2.txt
SELECT 'IRE', object_id, owner, object_name FROM all_objects
WHERE object_id BETWEEN 4000 AND 5999 AND rownum <= 1000;
SPOOL OFF;
SET MARKUP CSV OFF
SET TRIMSPOOL ON LINESIZE 1000 FEEDBACK ON PAGESIZE 14

The following example creates a hybrid partitioned table. It uses list partitioning, with a single internal partition for USA data, and two external partitions for GBR and IRE data respectively.

--DROP TABLE TEST_OL_HYBRID_PART_TAB PURGE;
CREATE TABLE TEST_OL_HYBRID_PART_TAB ( country_code VARCHAR2(3) NOT NULL,
object_id NUMBER NOT NULL,
owner VARCHAR2(128) NOT NULL,
object_name VARCHAR2(128) NOT NULL )
EXTERNAL PARTITION ATTRIBUTES(
TYPE ORACLE_LOADER DEFAULT DIRECTORY tmp_dir1
ACCESS PARAMETERS (
FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
MISSING FIELD VALUES ARE NULL
(country_code, object_id, owner, object_name) )
REJECT LIMIT UNLIMITED )
PARTITION BY LIST (country_code)
( PARTITION usa VALUES ('USA'), PARTITION gbr VALUES ('GBR')
EXTERNAL LOCATION ('gbr1.txt', 'gbr2.txt'),
PARTITION ire VALUES ('IRE')
EXTERNAL DEFAULT DIRECTORY tmp_dir2 LOCATION ('ire1.txt', 'ire2.txt') );

The EXTERNAL PARTITION ATTRIBUTES clause defines the table-level external table parameters. For external partitions, the EXTERNAL clause is used to define partition-level attributes, such as file locations and non-default directory objects.

The HYBRID column in the CDB|DBA|ALL|USER}_TABLES view show this is a hybrid table.

COLUMN table_name FORMAT A30
COLUMN hybrid FORMAT A6
SELECT table_name, hybrid FROM user_tables;

TABLE_NAME                     HYBRID
------------------------------ ------
TEST_OL_HYBRID_PART_TAB        YES

Currently, there is only data in the external partitions.

COLUMN country_code FORMAT A12
SELECT country_code, COUNT(*) as amount FROM   TEST_OL_HYBRID_PART_TAB
GROUP BY country_code ORDER BY country_code;
COUNTRY_CODE     AMOUNT
------------ ----------
GBR                2000
IRE                2000

DML commands perform on the internal partition as normal.

INSERT INTO TEST_OL_HYBRID_PART_TAB
SELECT 'USA', object_id, owner,  object_name
FROM   all_objects wHERE  rownum <= 2000;
COMMIT;

COLUMN country_code FORMAT A12
SELECT country_code, COUNT(*) as amount FROM TEST_OL_HYBRID_PART_TAB
GROUP BY country_code ORDER BY country_code;

COUNTRY_CODE     AMOUNT
------------ ----------
GBR                2000
IRE                2000
USA                2000

As you might expect, trying to insert data into the external partitions results in an error.

Note: Be advised that you cannot perform DML operations on the external partitions. Data manipulation can be done on the internal partitions whose data resides in database segments. Attempting to do DML operation on the external partition will fail and generate the following error.

INSERT INTO TEST_OL_HYBRID_PART_TAB VALUES ('GBR', 9999, 'X', 'X');
*
ERROR at line 1:
ORA-14466: Data in a read-only partition or subpartition cannot be modified.

SQL> delete from TEST_OL_HYBRID_PART_TAB  where object_id <= 2000;
delete from TEST_OL_HYBRID_PART_TAB  where object_id <= 2000
*
ERROR at line 1:
ORA-14466: Data in a read-only partition or subpartition cannot be modified.

SQL> update TEST_OL_HYBRID_PART_TAB set object_id=436 WHERE object_id=4363;
update TEST_OL_HYBRID_PART_TAB set object_id=436 WHERE object_id=4363
*
ERROR at line 1:
ORA-14466: Data in a read-only partition or subpartition cannot be modified.

SQL>

Creating a Hybrid Partitioned Table (ORACLE_DATAPUMP)

In this example we created a dump file called “bgr_xt.dmp” containing BGR data. Creating the external table generates the Data Pump dump file. When we drop the table the dump file remains on the database file system, so we can use it for our test.

CREATE TABLE bgr_xt
ORGANIZATION EXTERNAL (TYPE ORACLE_DATAPUMP DEFAULT DIRECTORY tmp_dir1
LOCATION ('bgr_xt.dmp')  ) AS  SELECT 'BGR' AS country_code,
object_id,owner, object_name FROM   all_objects where rownum <= 2000;

DROP TABLE bgr_xt;

The following example creates a hybrid partitioned table. It uses list partitioning, with a single internal partition for USA data, and a data pump external partition for BGR data.

--DROP TABLE test_dp_hybrid_part_tab PURGE;

CREATE TABLE test_dp_hybrid_part_tab (
country_code  VARCHAR2(3)   NOT NULL,
object_id     NUMBER        NOT NULL,
owner         VARCHAR2(128) NOT NULL,
object_name   VARCHAR2(128) NOT NULL
)  EXTERNAL PARTITION ATTRIBUTES (
TYPE ORACLE_DATAPUMP DEFAULT DIRECTORY tmp_dir1)
PARTITION BY LIST (country_code) (
PARTITION usa VALUES ('USA'),
PARTITION bgr VALUES ('BGR') EXTERNAL
LOCATION ('bgr_xt.dmp'));

The EXTERNAL PARTITION ATTRIBUTES clause defines the table-level Data Pump external table parameters. For external partitions, the EXTERNAL clause is used to define partition-level attributes, such as file locations and non-default directory objects.

The HYBRID column in the CDB|DBA|ALL|USER}_TABLES view show this is a hybrid table.

COLUMN table_name FORMAT A30
COLUMN hybrid FORMAT A6

SQL> SELECT table_name, hybrid FROM user_tables;

TEST_OL_HYBRID_PART_TAB        YES
TEST_DP_HYBRID_PART_TAB        YES

Currently, there is only data in the external partitions.

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) as amount FROM   test_dp_hybrid_part_tab
GROUP BY country_code ORDER BY country_code;

COUNTRY_CODE     AMOUNT
------------ ----------
BGR                2000

SQL>

DML Commands perform on the internal partition as normal.

INSERT INTO test_dp_hybrid_part_tab
SELECT 'USA', object_id, owner, object_name FROM   all_objects
WHERE  rownum <= 2000;
COMMIT;

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) as amount FROM test_dp_hybrid_part_tab 
GROUP BY country_code ORDER BY country_code;
COUNTRY_CODE     AMOUNT
------------ ----------
BGR                2000
USA                2000

Note: Please be advised that you cannot perform DML operations on the external partitions. Data Manipulation can only be done on the internal partitions whose data resides in database segments. Attempting to do DML operation on the external partition will fail and generate the below error.

You can read data from the external table partition and insert (also update or delete) data on the internal partition. And remember, you can only use DML operations on the internal partitions of a hybrid partitioned table because external partitions are treated as read-only partitions. There are no restrictions on standard partition operations such as adding and dropping partitions.

mINSERT INTO test_dp_hybrid_part_tab VALUES ('GBR', 9999, 'X', 'X');            
*ERROR at line 1:
ORA-14466: Data in a read-only partition or subpartition cannot be modified.

Adding and Dropping Partitions

There are no restrictions on standard partition operations such as adding and dropping partitions. In the below example, the partition for the IRE data from the TEST_OL_HYBRID_PART_TAB table has been dropped.

ALTER TABLE TEST_OL_HYBRID_PART_TAB DROP PARTITION ire;

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) as amount FROM  TEST_OL_HYBRID_PART_TAB
GROUP BY country_code ORDER BY country_code;

COUNTRY_CODE     AMOUNT
------------ ----------
GBR                2000
USA                2000

SQL>

The IRE data partition can also be added back.

ALTER TABLE TEST_OL_HYBRID_PART_TAB ADD PARTITION ire VALUES ('IRE') EXTERNAL
DEFAULT DIRECTORY tmp_dir2 LOCATION ('ire1.txt', 'ire2.txt');

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) as amount
FROM   TEST_OL_HYBRID_PART_TAB GROUP BY country_code 
ORDER BY country_code;

COUNTRY_CODE     AMOUNT
------------ ----------
GBR                2000
IRE                2000
USA                2000

In this example, the BGR data from the TEST_DP_HYBRID_PART_TAB table has been dropped.

ALTER TABLE test_dp_hybrid_part_tab DROP PARTITION bgr;

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) as amount
FROM   test_dp_hybrid_part_tab GROUP BY country_code
ORDER BY country_code;

COUNTRY_CODE     AMOUNT
------------ ----------
USA                2000

And below, the BGR data partition has been added back.

ALTER TABLE test_dp_hybrid_part_tab ADD PARTITION bgr VALUES ('BGR') EXTERNAL
DEFAULT DIRECTORY tmp_dir2 LOCATION ('bgr_xt.dmp');

COLUMN country_code FORMAT A12

SELECT country_code, COUNT(*) as amount
FROM   test_dp_hybrid_part_tab GROUP BY country_code
ORDER BY country_code;

COUNTRY_CODE     AMOUNT
------------ ----------
BGR                2000
USA                2000

Restrictions

The Restrictions for Hybrid Partitioned Tables include:

  • You can only use RANGE or LIST partitioning
  • There is no support for REFERENCE and SYSTEM partitioning

The full list of restrictions can be found in the 19c Oracle documentation. For more information see:

Conclusion

Oracle Hybrid Partitioned Tables integrate internal partitions and external partitions into a single partitioned table. Combining classical internal partitioned tables and external partitioned tables form a concept called Hybrid Partitioned Tables. Hybrid Partitioned Tables are especially appealing for lifecycle management and cheaper storage options.

By dividing a large table into smaller partitions, you can also improve query performance. Queries that access only a fraction of the data can run faster because there is less data to scan. Breaking out data by version into separate partitions allows more current data to be available in the database memory which uses database buffer pools more efficiently.