Data Sources for Business Analytics

Today, there are more data sources and data types than ever before. As sources and volumes continue to grow, the organizations getting the most value and insight from their data are those that can integrate and blend the widest variety of sources into meaningful business analytics.

Pentaho’s business analytics platform makes it easy to bring together data from various sources and blend relational data with unstructured data for powerful analytics. We continue to innovate in order to provide highly portable and extensible data integration capabilities, effectively future-proofing your data foundation.

With Pentaho’s Adaptive Big Data Layer, you can access and integrate all big data sources, including Hadoop, NoSQL, and analytic databases. Our drag-and-drop user interface includes hundreds of pre-built transformations, so you can access and integrate various data sources without coding and iteratively model and visualize data as you go.

Data Sources

Regardless of data source, analytics requirements or deployment environment, Pentaho helps you turn any data into insights. Here’s a partial list of data sources you can work with in Pentaho.

Relational Databases

Mission-critical transactions and customer records are a core part of almost all business environments, and frequently need to be integrated and blended with other data types. Pentaho allows you to connect to a wide variety of Relational Database Management Systems (RDBMS), such as:

  • MySQL
  • PostgreSQL
  • SQLite
  • Microsoft SQL Server
  • Oracle
  • IBM DB2
  • Teradata
  • Azure SQL Database
  • Azure SQL Server
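
As a rough illustration of what "blending" relational records with another data type means, the sketch below joins rows from a SQLite table with click events supplied as JSON. It uses only the Python standard library and is not Pentaho's API; Pentaho performs this kind of work through its graphical interface. The function blend_customers_with_events, the customers table, and the sample data are all hypothetical names chosen for the example.

```python
import json
import sqlite3


def blend_customers_with_events(conn, events_json):
    """Attach a per-customer event count (from JSON) to relational rows.

    Hypothetical helper for illustration only; not part of Pentaho.
    """
    # Count events per customer from the semi-structured source.
    counts = {}
    for event in json.loads(events_json):
        cid = event["customer_id"]
        counts[cid] = counts.get(cid, 0) + 1

    # Read the relational side and merge the counts in.
    rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    return [
        {"id": cid, "name": name, "clicks": counts.get(cid, 0)}
        for cid, name in rows
    ]


# Sample data (assumed for the example).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)
events_json = (
    '[{"customer_id": 1, "page": "/pricing"},'
    ' {"customer_id": 1, "page": "/docs"}]'
)

blended = blend_customers_with_events(conn, events_json)
```

Customers with no matching events are kept with a count of zero, which mirrors a left join from the relational table to the event data.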

Analytic Databases

Pentaho connects to a range of analytic databases, allowing users to easily answer questions against large or complex data sets. Some of these include:

  • HPE Vertica
  • Greenplum
  • Amazon Redshift
  • Teradata
  • Netezza

NoSQL Databases

More and more businesses require data stores that can quickly ingest high-volume data and accommodate flexible data structures. Pentaho helps you connect to common NoSQL databases, such as:

  • Cassandra
  • MongoDB
  • Apache HBase
  • CouchDB


Hadoop

Pentaho empowers users to ingest, blend, and analyze diverse data at scale with Hadoop – all without writing a line of code. Pentaho works with a wide variety of Hadoop distributions and components, including:

  • Cloudera
  • Hortonworks
  • MapR
  • Amazon EMR
  • Microsoft Azure HDInsight
  • Hive
  • Impala
  • Apache Spark

Other Data Sources: Files, Business Applications, and More

  • JSON
  • XML
  • Amazon S3
  • Excel
  • Text Files
  • CSV Files
  • RSS Feeds
  • Business Applications (ERP, CRM)
  • Avro
  • Splunk
  • Email messages
  • Google Analytics
  • HL7 data
  • Java Message Service (JMS)
  • Fixed-width Files
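
File formats in the list above, such as CSV and fixed-width text, differ mainly in how fields are delimited. The sketch below reads both into the same list-of-records shape so they can be combined. It is a minimal standard-library example, not Pentaho functionality; the function names, the column layout, and the sample data are assumptions made for illustration.

```python
import csv
import io


def read_csv_records(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def read_fixed_width_records(text, fields):
    """Parse fixed-width text.

    `fields` is a list of (name, start, end) column slices; this layout
    format is a hypothetical convention for the example.
    """
    records = []
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            records.append(
                {name: line[start:end].strip() for name, start, end in fields}
            )
    return records


# Sample inputs (assumed for the example).
csv_text = "sku,qty\nA100,3\nB200,5\n"
fixed_text = "A100  3\nC300 12\n"
layout = [("sku", 0, 4), ("qty", 4, 7)]

combined = read_csv_records(csv_text) + read_fixed_width_records(fixed_text, layout)
```

Both readers return string values, so any type conversion (for example, turning qty into an integer) would be a separate step.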

To see a full list of Pentaho-supported components, visit the Documentation. Learn more about Pentaho’s data integration capabilities.