What Is Oracle Big Data SQL?
Oracle Big Data SQL is a query layer that allows Oracle Database to read data directly from Hadoop Distributed File System (HDFS), Apache Hive, Apache Kafka, Oracle NoSQL Database, and other non-relational data stores. Rather than moving data into Oracle Database, Big Data SQL pushes query execution down into the Hadoop cluster, a capability known as Smart Scan, and returns results to Oracle. This enables organisations to run Oracle SQL and PL/SQL queries across multi-source environments without extracting data from the data lake.
From a business value perspective, Big Data SQL allows Oracle Database users — including applications, BI tools and reporting layers — to access Hadoop-resident data without rewriting queries or building complex ETL pipelines. The product bridges the traditional Oracle relational world with modern data lake architecture.
Why Big Data SQL Licensing Is Unusual
Almost every Oracle technology product is licensed using one of two metrics: Named User Plus (per authorised user) or Processor (per CPU core, adjusted by Oracle's core factor table). Big Data SQL uses neither. Its metric is the number of disk drives in the Hadoop cluster, an infrastructure-level measurement that reflects the storage footprint of the data lake being queried, not the number of users or processors on the Oracle Database side.
This distinction matters enormously for compliance. Organisations that estimate their Big Data SQL licence requirement using processor or user counts will reach completely wrong answers. The correct starting point is a physical drive inventory across every data node in the cluster.
The Disk Drive Licensing Model Explained
Oracle Big Data SQL must be licensed for every disk drive in the Hadoop cluster where it is installed and running. Several rules govern exactly what "in the cluster" means.
All Data Nodes Must Be Fully Licensed
Oracle's licensing rules do not permit partial cluster licensing. If Oracle Big Data SQL is installed and operational on a Hadoop cluster, every data node in that cluster must be licensed, and every disk drive within each data node must be counted. There is no concept of licensing only the nodes that Big Data SQL directly queries at a given moment — the entire cluster falls within scope.
This all-or-nothing requirement is one of the most impactful aspects of Big Data SQL licensing. Organisations that expand their Hadoop cluster by adding new data nodes — common in growing data lake environments — automatically expand their Big Data SQL licence requirement. Each new node's drives must be licensed before that node becomes operational in the cluster.
External Data Sources Are Also In Scope
Beyond HDFS data node drives, Oracle's Big Data SQL licence requirement extends to disks used by external data sources that Big Data SQL queries. This includes Kafka nodes providing streaming data, Oracle NoSQL Database storage nodes, and other integrated external systems. Many organisations discover that the true drive count is substantially higher than their initial HDFS-only estimate once external source storage is included.
A Separate Licence Per Cluster
Oracle Big Data SQL requires a separate licence for each Hadoop cluster. If your organisation runs multiple independent Hadoop clusters — common in enterprises that maintain separate production, staging and analytics clusters — each cluster requires its own Big Data SQL licence. The drive count is calculated independently per cluster, and licences cannot be transferred between clusters without Oracle's authorisation.
Oracle Copy to Hadoop Is Included
One cost-mitigating factor: Oracle Copy to Hadoop, a tool that exports data from Oracle Database to HDFS, is included in the Big Data SQL licence at no additional cost. Organisations that make regular use of Copy to Hadoop as part of their data pipeline would otherwise need to acquire it separately.
How to Calculate Your Big Data SQL Licence Requirement
Calculating your Big Data SQL licence obligation requires a systematic hardware inventory. Follow this process to arrive at an accurate count.
Step 1: Identify All Clusters Running Big Data SQL
Confirm which Hadoop clusters have Oracle Big Data SQL installed and configured. Even clusters where Big Data SQL is installed but not actively used are within licence scope — installation, not active usage, is what triggers the licence requirement.
Step 2: Inventory All Data Nodes Per Cluster
For each cluster, list every data node. In large Hadoop deployments, data nodes may be spread across multiple racks and availability zones. The node count must include all current members of the cluster, including any recently added nodes.
Step 3: Count Physical Drives Per Node
Count every physical disk drive installed in each data node. This typically means local HDDs or SSDs configured as HDFS data drives. Note that spare drives or hot-standby drives installed in nodes that are part of the cluster may also be in scope, depending on Oracle's interpretation of the applicable licence agreement language. When in doubt, include them — the cost of under-counting is greater than the cost of over-counting.
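As a rough aid for this step, the sketch below counts the block devices of type "disk" that a Linux data node reports through `lsblk`. It is an illustration only: the helper names `count_disks` and `local_disk_count` are hypothetical, and it assumes the node runs Linux with `lsblk` available. The operating system sees logical devices, so drives behind a hardware RAID controller may appear as a single disk, and operating-system drives are counted alongside HDFS data drives. Treat the result as a starting point to reconcile against the physical hardware record, not as the licensable figure.

```python
import subprocess


def count_disks(lsblk_output: str) -> int:
    """Count lines whose TYPE column is 'disk' in `lsblk -d -n -o NAME,TYPE` output."""
    count = 0
    for line in lsblk_output.splitlines():
        fields = line.split()
        # Each line is expected to look like "sda disk" or "sr0 rom".
        if len(fields) == 2 and fields[1] == "disk":
            count += 1
    return count


def local_disk_count() -> int:
    """Run lsblk on the local node and count whole-disk devices (no partitions)."""
    result = subprocess.run(
        ["lsblk", "-d", "-n", "-o", "NAME,TYPE"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_disks(result.stdout)


# Illustrative output from a hypothetical node: three disks, one optical
# drive and one loop device. Only the three disks are counted.
sample = "sda   disk\nsdb   disk\nnvme0n1 disk\nsr0   rom\nloop0 loop\n"
sample_count = count_disks(sample)
```

Running `local_disk_count()` on each data node and recording the result per host gives a per-node figure to check against asset records before moving on to external sources.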
Step 4: Add External Source Drives
Identify any Kafka nodes, Oracle NoSQL storage nodes, or other external systems that Big Data SQL is configured to query. Add the drive count from those nodes to your total.
Step 5: Multiply by Current Licence Price
Multiply your total drive count by the Big Data SQL per-drive list price and apply any negotiated discount. Verify your entitlement certificate against this calculated requirement to identify any shortfall before Oracle does.
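The five steps above can be sketched as a small calculation. Everything specific in this example is an assumption for illustration: the class and function names, the node names, the drive counts, the per-drive price of 4,000 and the 25% discount are all hypothetical and do not reflect Oracle's price list or any real agreement.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    drives: int  # physical drives counted in Step 3


@dataclass
class Cluster:
    name: str
    data_nodes: list = field(default_factory=list)      # Step 2
    external_nodes: list = field(default_factory=list)  # Step 4 (Kafka, NoSQL, ...)

    def drive_count(self) -> int:
        """Total in-scope drives: every data node plus every external source node."""
        data = sum(n.drives for n in self.data_nodes)
        external = sum(n.drives for n in self.external_nodes)
        return data + external


def licence_cost(cluster: Cluster, price_per_drive: float, discount: float = 0.0) -> float:
    """Step 5: drive count times per-drive price, less any negotiated discount."""
    return cluster.drive_count() * price_per_drive * (1 - discount)


# Hypothetical production cluster: 3 data nodes with 12 drives each,
# plus 2 Kafka nodes with 4 drives each.
prod = Cluster(
    name="prod",
    data_nodes=[Node(f"dn{i}", 12) for i in range(1, 4)],
    external_nodes=[Node(f"kafka{i}", 4) for i in range(1, 3)],
)

total_drives = prod.drive_count()                           # 36 + 8 = 44
net_cost = licence_cost(prod, price_per_drive=4000, discount=0.25)  # 132000.0
```

Because each cluster is licensed separately, the same calculation would be repeated once per cluster rather than summed into a single pool.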
Compliance Risks and Audit Exposure
Oracle Big Data SQL is within Oracle LMS (License Management Services) audit scope. Several patterns create disproportionate compliance risk in practice.
Cluster Expansion Without Licence Top-Up
The most common compliance gap we see is organisations that scale Hadoop clusters dynamically — adding data nodes in response to storage and processing demands — without corresponding licence additions. In environments where cluster capacity is managed by platform engineering teams who are separate from software asset management, new nodes are routinely added without any licence review. Over 12 to 24 months, an initially compliant deployment can become significantly under-licensed as the cluster grows.
Multiple Cluster Deployments
Organisations that run separate Hadoop clusters for different purposes (one for raw data ingestion, one for processed analytics data, one for machine learning workloads) must license each cluster independently. Teams that acquired a single Big Data SQL licence and deployed it across multiple clusters are in breach of the one-licence-per-cluster requirement.
Test and Development Clusters
Oracle's standard licence terms do not automatically exempt test, development or staging clusters from the Big Data SQL licence requirement. Unless your licence agreement explicitly includes a development-use provision, every cluster running Big Data SQL — including non-production environments — requires a licence. Many organisations that have licensed Big Data SQL only for production are running unlicensed installations in dev and test.
Oracle LMS Scripts and Audit Mechanics
In an Oracle audit, Oracle LMS typically requests hardware inventory documentation for every Hadoop cluster. Oracle's collection scripts can enumerate data nodes and, in some configurations, identify the installed drive count. Organisations that cannot produce accurate, current drive inventories per cluster face the risk of Oracle constructing an audit claim from Oracle's own discovery data rather than from accurate customer-supplied information. Maintaining current cluster documentation is both a compliance practice and an audit-defence measure.
Support Costs and Annual Escalation
Big Data SQL is licensed perpetually, with annual support fees at approximately 22% of the net licence price paid. As with all Oracle technology products, support fees increase by 8% per year. This escalation applies to Big Data SQL support in the same way it applies to Oracle Database and other technology licences: it is not discretionary and is embedded in Oracle's standard support agreement terms.
For large Hadoop deployments with hundreds of licensed drives, the compounding effect of 8% annual support escalation is material. Organisations should model their five-year Big Data SQL support trajectory explicitly when making investment decisions about large-scale data lake deployments.
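A minimal sketch of such a model follows, using the 22% support rate and 8% escalation quoted above. The function name and the net licence price of 1,000,000 are hypothetical, and the sketch assumes the first increase takes effect in year two; actual timing depends on the support agreement.

```python
def support_trajectory(net_licence_price: float,
                       years: int = 5,
                       support_rate: float = 0.22,
                       escalation: float = 0.08) -> list:
    """Annual support fee for each year, compounding the escalation from year two."""
    first_year = net_licence_price * support_rate
    return [first_year * (1 + escalation) ** year for year in range(years)]


# Hypothetical net licence price of 1,000,000.
fees = support_trajectory(1_000_000)
five_year_total = sum(fees)

for year, fee in enumerate(fees, start=1):
    print(f"Year {year}: {fee:,.2f}")
print(f"Five-year total: {five_year_total:,.2f}")
```

Under these assumptions the annual fee rises from about 220,000 in year one to about 299,300 in year five, and the five-year support total of roughly 1.29 million exceeds the original licence price.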
Cost-Reduction Strategies
Several approaches can reduce the total cost of Oracle Big Data SQL deployments without compromising functionality.
Consolidate Clusters to Reduce Licence Count
Rather than maintaining separate Big Data SQL licences for multiple purpose-specific clusters, organisations may be able to consolidate workloads onto fewer, larger clusters. This reduces the per-cluster licence overhead and simplifies compliance management. Consolidation requires capacity planning and workload analysis but is often viable for organisations that have proliferated clusters over time without architectural governance.
Minimise External Source Drive Scope
Where possible, structuring data pipelines so that Big Data SQL queries HDFS data rather than directly querying external Kafka or NoSQL sources may reduce the in-scope drive count. This requires architectural analysis but can meaningfully reduce licence requirements in data-rich environments.
Negotiate at Oracle's Q4 Window
Oracle's fiscal year ends 31 May, with the Q4 window running March to May. Technology product deals — including Big Data SQL licence expansions and new deployments — attract better commercial terms during this period. Oracle's field sales teams work to close technology deals in Q4 to meet annual targets, creating pricing leverage that is not available at other times of the year. Aligning Big Data SQL licence top-ups or new cluster deployments with the Q4 window consistently delivers material discounts. Work with Oracle licensing advisory specialists to time and structure each deal correctly.
Bundle Big Data SQL Into Broader Oracle Negotiations
Big Data SQL purchased as a standalone product typically attracts standard discount levels. Organisations that include Big Data SQL in the context of broader Oracle negotiations — Oracle Database renewals, Oracle platform agreements, or cloud commitments — can leverage the total deal value to improve the Big Data SQL discount. Standalone procurement almost always produces a worse commercial outcome than bundled negotiation.
Proactive Internal Compliance Review
Given the dynamic nature of Hadoop cluster scaling, a quarterly compliance review of Big Data SQL licence coverage is advisable for organisations with active data lake environments. A rolling audit of node counts, drive inventories and licence entitlements prevents small gaps from compounding into large audit exposures. The cost of a proactive internal review is a fraction of the cost of remediating an Oracle-initiated audit finding.
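One way to structure the quarterly review is to compare the current drive inventory for each cluster with the drives covered by entitlement. The sketch below assumes both are available as simple mappings from cluster name to drive count; the function name, cluster names and figures are hypothetical.

```python
def licence_gaps(inventory: dict, entitlements: dict) -> dict:
    """Return the drive shortfall for each under-licensed cluster.

    A cluster missing from `entitlements` is treated as having zero
    licensed drives, which flags unlicensed dev and test clusters.
    """
    gaps = {}
    for cluster, drives in inventory.items():
        shortfall = drives - entitlements.get(cluster, 0)
        if shortfall > 0:
            gaps[cluster] = shortfall
    return gaps


# Hypothetical figures: prod has grown since purchase, dev was never licensed.
inventory = {"prod": 144, "staging": 48, "dev": 24}
entitlements = {"prod": 120, "staging": 48}

gaps = licence_gaps(inventory, entitlements)  # {"prod": 24, "dev": 24}
```

Surplus entitlement on one cluster is deliberately not offset against a shortfall on another, because each cluster is licensed independently.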