Cloudera Related Exams
CCA175 Exam
Problem Scenario 10 : You have been given following mysql database details as well as other info.
user=retail_dba
password=cloudera
database=retail_db
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish following.
1. Create a database named hadoopexam and then create a table named departments in it, with following fields. department_id int,
department_name string
e.g. location should be hdfs://quickstart.cloudera:8020/user/hive/warehouse/hadoopexam.db/departments
2. Please import data in existing table created above from retaidb.departments into hive table hadoopexam.departments.
3. Please import data in a non-existing table, means while importing create hive table named hadoopexam.departments_new
Problem Scenario 86 : In Continuation of previous question, please accomplish following activities.
1. Select Maximum, minimum, average , Standard Deviation, and total quantity.
2. Select minimum and maximum price for each product code.
3. Select Maximum, minimum, average , Standard Deviation, and total quantity for each product code, hwoever make sure Average and Standard deviation will have maximum two decimal values.
4. Select all the product code and average price only where product count is more than or equal to 3.
5. Select maximum, minimum , average and total of all the products for each code. Also produce the same across all the products.
Problem Scenario 45 : You have been given 2 files , with the content as given Below
(spark12/technology.txt)
(spark12/salary.txt)
(spark12/technology.txt)
first,last,technology
Amit,Jain,java
Lokesh,kumar,unix
Mithun,kale,spark
Rajni,vekat,hadoop
Rahul,Yadav,scala
(spark12/salary.txt)
first,last,salary
Amit,Jain,100000
Lokesh,kumar,95000
Mithun,kale,150000
Rajni,vekat,154000
Rahul,Yadav,120000
Write a Spark program, which will join the data based on first and last name and save the joined results in following format, first Last.technology.salary