With a few mild hacks, Aginity Workbench for Hadoop can be used as a client to run Spark SQL queries.
How do I get Spark SQL?
If you are looking to deploy a herd of elephants to churn the forest, then you already know how to get it. If you are looking for an infant elephant to snuggle and get a feel for things, the Hortonworks Sandbox 2.3 will do. Download a 2.3 sandbox from Hortonworks. I have another post about setting up a host-only network with VirtualBox and the Hortonworks sandbox: Link
Please note: Without a host-only network, external clients cannot connect to the sandbox. Make sure the sandbox is configured correctly. Link
Assuming: we now have a working environment.
Some Concepts (feel free to correct me in the comments if I am wrong)
- For Hive or Spark clients to talk to and issue commands against these services, a thrift service is required. E.g. HiveServer / HiveServer2 are the thrift services that serve external Hive clients.
- For Spark SQL to be accessible to external clients, the Spark thrift service has to be running.
- For Spark SQL to work, Spark has to know about the Hive metastore. So we have to keep Spark informed of the Hive metastore configuration.
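To make the thrift idea concrete: both HiveServer2 and the Spark thrift server speak the same HiveServer2 wire protocol, so a JDBC client distinguishes them only by host and port. A minimal sketch (the host name is an assumption based on the sandbox; 10000 is the HiveServer2 default, and 11000 is the port this walkthrough later gives the Spark thrift server):

```shell
# Build the jdbc:hive2 URL a thrift client (beeline, Aginity, ...) uses.
# Only host and port change between HiveServer2 and the Spark thrift server.
jdbc_url() {
  printf 'jdbc:hive2://%s:%s/default\n' "$1" "$2"
}

HIVE_URL=$(jdbc_url sandbox.hortonworks.com 10000)   # HiveServer2 default port
SPARK_URL=$(jdbc_url sandbox.hortonworks.com 11000)  # Spark thrift server port

echo "$HIVE_URL"
echo "$SPARK_URL"
```

The same URL shape is what Aginity Workbench builds from the host and port fields in its connection dialog.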
Confirm Hive is working
Start a Hive shell and fire a query:
- On the Hortonworks sandbox, press ALT + F5 and log in as root/hadoop
- Issue the command "hive" to fire up a shell.
- "show tables;" should list the tables in the default schema.
- "select count(*) from sample_07;" against one of the existing tables should confirm Hive is working.
Confirm Spark is working
Before bringing up the Spark shell, it is advisable to change the logging level from INFO to WARN, as otherwise Spark spits out a lot of text to the console.
Open the file /usr/hdp/current/spark-client/conf/log4j.properties and on the first line change the log level from INFO to WARN.
After the change, my file looks like below:
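For reference, this is roughly what the top of the file looks like after the edit; the exact contents vary between Spark versions, and only the rootCategory line matters here:

```properties
# Set everything to be logged to the console
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
```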
Fire up a spark-sql shell and run a few commands to confirm that Spark is working fine. The following screenshot from my environment confirms that it is.
Note: If you have not changed your log level as mentioned above, the screen below will look very different.
Start Spark thrift service
To keep the Hive and Spark thrift servers from colliding on the same port, we need to start the Spark thrift server on a different port.
Run the following command.
start-thriftserver.sh --master yarn --executor-memory 512m --hiveconf hive.server2.thrift.port=11000
Notice the output log file mentioned in the screenshot above.
Run the following command to tail the log and confirm that spark-sql is actually working.
tail -f /var/log/spark/spark-root-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-sandbox.hortonworks.com.out
So now the Spark SQL thrift service is running. Time to connect the client.
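If you want to sanity-check the thrift service from a shell before pointing a GUI at it, beeline (the JDBC client that ships with Hive) works too. A sketch, assuming the sandbox host and root/hadoop credentials used throughout this post, plus the 11000 port passed to start-thriftserver.sh; it degrades to printing the URL when beeline is not installed or the sandbox is unreachable:

```shell
# JDBC URL for the Spark thrift server (port from hive.server2.thrift.port)
SPARK_JDBC_URL="jdbc:hive2://sandbox.hortonworks.com:11000/default"

if command -v beeline >/dev/null 2>&1; then
  # Quick query through the Spark thrift server
  beeline -u "$SPARK_JDBC_URL" -n root -p hadoop -e "show tables;" \
    || echo "could not reach the sandbox from this machine"
else
  echo "no beeline here; would connect to: $SPARK_JDBC_URL"
fi
```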
Use Aginity Workbench to connect to Spark
Get Aginity Workbench for Hadoop if you do not have it already. It's free.
Notice below the port number 11000. By default it is 10000; since we changed it to 11000 while starting the thrift service, we connect on 11000.
After connecting, I could run a query and see the results. The screenshot below confirms that.
How do we know that spark-sql is actually running and giving us the results?
Look at the log tail that we started earlier... here is a screenshot.
So enjoy spark SQL on a nice client interface.