Sqoop: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 20:18, 5 November 2016

Apache Sqoop
Developer(s): Apache Software Foundation
Stable release: 1.4.6 / May 11, 2015
Written in: Java
Operating system: Cross-platform
Type: Data management
License: Apache License 2.0
Website: sqoop.apache.org

Sqoop is a command-line interface application for transferring data between relational databases and Hadoop.[1] It supports incremental loads of a single table or of a free-form SQL query, as well as saved jobs, which can be run repeatedly to import the updates made to a database since the last import. Imports can also be used to populate tables in Hive or HBase.[2] Exports can be used to put data from Hadoop into a relational database. The name Sqoop is a contraction of "SQL" and "Hadoop". Sqoop became a top-level Apache project in March 2012.[3]
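The incremental-load and saved-job workflow described above can be illustrated with a minimal Python sketch that only assembles the command lines and does not run Sqoop. The flags shown (`--incremental append`, `--check-column`, `--last-value`, `sqoop job --create` and `--exec`) are taken from the Sqoop 1.4 command-line interface; the JDBC URL, table, column, directory and job names, and the two helper functions, are hypothetical and chosen only for illustration.

```python
import shlex


def incremental_import_args(jdbc_url, table, check_column, last_value, target_dir):
    """Build the argument list for an append-mode incremental `sqoop import`."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,            # JDBC URL of the source database
        "--table", table,                 # single table to import
        "--target-dir", target_dir,       # HDFS directory that receives the rows
        "--incremental", "append",        # only fetch rows added since the last run
        "--check-column", check_column,   # column Sqoop compares against --last-value
        "--last-value", str(last_value),  # highest value seen by the previous import
    ]


def saved_job_args(job_name, import_args):
    """Wrap an import in `sqoop job --create` so it can be re-run by name."""
    # Drop the leading "sqoop"; the tool name "import" follows the "--" separator.
    return ["sqoop", "job", "--create", job_name, "--"] + import_args[1:]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    args = incremental_import_args(
        "jdbc:mysql://db.example.com/sales", "orders", "id", 0, "/data/orders")
    print(shlex.join(args))
    print(shlex.join(saved_job_args("orders_sync", args)))
    # Each later execution of the saved job imports only the new rows.
    print("sqoop job --exec orders_sync")
```

A saved job stores the last imported value between runs, so repeated executions of the same job pick up only the rows added since the previous one.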

Informatica Big Data Management provides a Sqoop-based connector as of version 10.1. Informatica supports both Sqoop import and Sqoop export, which are often used in data integration use cases on Hadoop.

Pentaho has provided open-source Sqoop-based connector steps, Sqoop Import[4] and Sqoop Export,[5] in its ETL suite Pentaho Data Integration since version 4.5 of the software.[6] Microsoft uses a Sqoop-based connector to help transfer data from Microsoft SQL Server databases to Hadoop.[7] Couchbase, Inc. also provides a Couchbase Server-Hadoop connector by means of Sqoop.[8]

In 2015, Ralph Kimball described Sqoop as follows under the heading "The Future of ETL":[9]

Several big changes must take place in the ETL environment. First, the data feeds from original sources must support huge bandwidths, at least gigabytes per second. Learn about Sqoop loading data into Hadoop. If these words mean nothing to you, you have some reading to do! Start with Wikipedia.

References

  1. ^ "Hadoop: Apache Sqoop". Retrieved 2012-09-08.
  2. ^ "Apache Sqoop - Overview". Retrieved 2012-09-08.
  3. ^ "Apache Sqoop Graduates from Incubator". Retrieved 2012-09-08.
  4. ^ "Sqoop Import". Pentaho. 2015-12-10. Archived from the original on 2015-12-10. Retrieved 2015-12-10. The Sqoop Import job allows you to import data from a relational database into the Hadoop Distributed File System (HDFS) using Apache Sqoop.
  5. ^ "Sqoop Export". Pentaho. 2015-12-10. Archived from the original on 2015-12-10. Retrieved 2015-12-10. The Sqoop Export job allows you to export data from Hadoop into an RDBMS using Apache Sqoop.
  6. ^ "Big Data Analytics Vendor Pentaho Announces Tighter Integration with Cloudera; Extends Visual Interface to Include Hadoop Sqoop and Oozie". Database Trends and Applications (dbta.com). 2012-07-27. Archived from the original on 2015-12-08. Retrieved 2015-12-08. Pentaho's Business Analytics 4.5 is now certified on Cloudera's latest releases, Cloudera Enterprise 4.0 and CDH4. Pentaho also announced that its visual design studio capabilities have been extended to the Sqoop and Oozie components of Hadoop.
  7. ^ "Microsoft SQL Server Connector for Apache Hadoop". Retrieved 2012-09-08.
  8. ^ "Couchbase Hadoop Connector". Retrieved 2012-09-08.
  9. ^ Kimball, Ralph (2015-12-01). "Design Tip #180 The Future Is Bright". Kimball Group. Archived from the original on 2015-12-03. Retrieved 2015-12-03. Several big changes must take place in the ETL environment. First, the data feeds from original sources must support huge bandwidths, at least gigabytes per second. Learn about Sqoop loading data into Hadoop. If these words mean nothing to you, you have some reading to do! Start with Wikipedia.
