I need help with the following issue.
I created a Hive table through a Spark data frame from CSV files stored in an HDFS path.
After creating the Hive table, I ran the query below and it returned zero records. The same query works fine in Hue and returns the record count. Could you let me know what the issue is?
df <- tbl(sc, sql("SELECT * FROM table1 LIMIT 100"))
df %>% count()
Source: lazy query [?? x 1]
Database: spark_connection
n
0
Thanks and Regards
Sankar Narayana
Hi, are you querying the exact same database inside Hive?
Yes, I am querying the same database.
It works fine on other database tables. It fails only on tables that were created from CSV or text files.
Maybe try:
df %>%
summarize(cnt = n())
I got the same result, zero:
df %>%
summarize(cnt = n())
Source: lazy query [?? x 1]
Database: spark_connection
cnt
0
My configuration is all correct, and the SQL runs fine in Hive.
This Hive table was created from text files using the spark_read_csv command with "\t" as the delimiter.
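For reference, a creation step like the one described might look roughly like the sketch below. It assumes `sc` is an open sparklyr connection. The HDFS path and the name `table1_tmp` are hypothetical, and the `spark_write_table()` call is an assumption about how the data was saved as a Hive table, since the post does not say.

```r
library(sparklyr)

# Assumes `sc` is an existing connection created with spark_connect().
# The HDFS path and the temporary name are hypothetical placeholders.
sdf <- spark_read_csv(
  sc,
  name = "table1_tmp",
  path = "hdfs:///path/to/files",
  delimiter = "\t"
)

# Number of rows Spark actually read from the files
sdf_nrow(sdf)

# Assumed step: save the Spark DataFrame as a table in the metastore
spark_write_table(sdf, "table1")
```

If `sdf_nrow()` already reports zero here, the problem is probably in how the files are read rather than in the later query.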
Could you check the Hive metadata for the table?
Make sure the connection to Hive is working properly.
df %>%
tbl_vars()
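A minimal sketch of such a metadata check, assuming `sc` is the open sparklyr connection and `table1` is the table in question. It only uses Spark SQL statements passed through DBI, so it shows what Spark itself sees rather than what Hue sees.

```r
library(sparklyr)
library(dplyr)
library(DBI)

# Assumes `sc` is an existing connection created with spark_connect().

# Tables visible to this Spark session
src_tbls(sc)

# Hive metadata for the table: location, input format, SerDe properties
dbGetQuery(sc, "DESCRIBE FORMATTED table1")

# Row count computed directly by Spark SQL, without going through dplyr
dbGetQuery(sc, "SELECT COUNT(*) AS n FROM table1")

# Column names as dplyr sees them
tbl(sc, "table1") %>% tbl_vars()
```

If the count is zero here as well while Hue returns rows, comparing the location and SerDe details from DESCRIBE FORMATTED with what Hue reports may help narrow things down.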