How to access a Hive table using PySpark?
PySpark: Apache Spark is an in-memory data processing framework written in Scala. It processes data 100 times faster
Create database in Hive: A database is an organised collection of tables. Hive has a default database with the name as
Hive JDBC: Hive allows applications to connect to it using the JDBC driver. The JDBC driver uses Thrift to communicate
External Table in Hive: When we create a table in Hive, we can define the type of the table. As
Introduction: A shell script can be used to run Hive queries in batch mode. It will handle the input values/arguments
Hourly partitions in Hive table: When we have large quantities of data, we look for a partition column to improve the
If condition/statement in Hive: Hive supports many conditional functions such as If, isnull, isnotnull, nvl, nullif, COALESCE and CASE. The
Sum() function in Hive: Sum is one of the aggregate functions that returns the sum of the values of the
WebHDFS: WebHDFS is a protocol based on an industry-standard RESTful mechanism. It provides the same functionality as HDFS,
Show databases like query in Hive: The Show databases or Show schemas statement lists all the database names in Hive