Why is using R a good option for this data mining course?
Microsoft is an active player in cloud analytics with its Azure Machine Learning Studio and Stream Analytics. Azure works with Hadoop clusters as well as with traditional relational databases. Azure ML offers a broad range of algorithms such as boosted trees and support vector machines as well as supporting R scripts and Python. Azure ML also supports a workflow interface making it more suitable for the nonprogrammer data scientist. The real-time analytics component is designed to allow streaming data from a variety of sources to be analyzed on the fly. XLMiner’s cloud version is based on Microsoft Azure. Microsoft also acquired Revolution Analytics, a major player in the R analytics business, with a view to integrating Revolution’s “R Enterprise” with SQL Server and Azure ML. R Enterprise includes extensions to R that eliminate memory limitations and take advantage of parallel processing.
One drawback of the cloud-based analytics tools is a relative lack of transparency and user control over the algorithms and their parameters. In some cases, the service will simply select a single model that is a black box to the user. Another drawback is that for the most part cloud-based tools are aimed at more sophisticated data scientists who are systems savvy.
Data science is playing a central role in enabling many organizations to optimize everything from production to marketing. New storage options and analytical tools promise even greater capabilities. The key is to select technology that’s appropriate for an organization’s unique goals and constraints. As always, human judgment is the most important component of a data mining solution.
The focus is on a comprehensive understanding of the different techniques and algorithms used in data mining, and less on the data management requirements of real-time deployment of data mining models. XLMiner’s short learning curve and integration with Excel make it ideal for this purpose, and for exploration, prototyping, and piloting of solutions.
(By Herb Edelstein; page 45 of Data Mining for Business Analytics: Concepts, Techniques, and Applications in R)
Answer the following questions:
1. Choose three to five software tools that you think are the most popular and useful in data mining, and comment on their pros and cons.
2. What are the pros and cons of cloud-based analytics?
3. Why is using R a good option for this data mining course?
Be sure to address each point in your main post. Also, be sure to illustrate with an example (e.g., if you think clustering is relevant, describe what you think a likely cluster might contain and what the real-world meaning would be).