In this post, a simple approach to internal load balancing is demonstrated: requests are redirected across multiple instances of the same application, depending on the number of processes bound to each.
In this post, we discuss how to work around R's lack of multi-threading by queuing jobs with the jobqueue package.
One option for boosting SparkR's performance as a data processing engine is to manipulate data in the Hive Context rather than in the more limited SQL Context. In this post, we discuss how to run SparkR in the Hive Context.
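As a taste of the post, here is a minimal sketch of switching to the Hive Context, assuming the Spark 1.x SparkR API; the table name and query are illustrative.

```r
library(SparkR)

# Initialise SparkR (local mode here for illustration)
sc <- sparkR.init(master = "local[*]", appName = "hive-context-demo")

# The plain SQL Context supports only a subset of SQL;
# the Hive Context unlocks Hive QL features such as window functions.
hiveContext <- sparkRHive.init(sc)

# Illustrative table and query
sql(hiveContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
df <- sql(hiveContext, "SELECT key, value FROM src")
head(df)

sparkR.stop()
```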
In this post, we discuss how to execute SparkR in local and cluster modes.
We discuss how to set up a Spark cluster between two Ubuntu guests, beginning with machine preparation.
We discuss how to make use of Python output in R using a package.
An article outlining the benefits of Python for R users.
Introduction to Python
Setting a random seed is important for reproducible analysis. In this post, we discuss how to set random seeds using the caret package.
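As a preview, this is a minimal sketch of seeding resampling in caret via the `seeds` argument of `trainControl()`; the fold count, tune length, and model are illustrative choices.

```r
library(caret)

# A global seed covers data splitting and other top-level randomness.
set.seed(123)

# trainControl() also accepts a list of seeds, one integer vector per
# resampling iteration (10-fold CV here) plus one final seed for the
# last model fit. Each vector must have at least as many entries as
# tuning parameter combinations (3 here, matching tuneLength below).
seeds <- vector(mode = "list", length = 11)
for (i in 1:10) seeds[[i]] <- sample.int(1000, 3)
seeds[[11]] <- sample.int(1000, 1)

ctrl <- trainControl(method = "cv", number = 10, seeds = seeds)
fit <- train(Species ~ ., data = iris, method = "rpart",
             tuneLength = 3, trControl = ctrl)
```

Supplying explicit seeds this way keeps results reproducible even when resampling runs in parallel.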
We discuss how to turn an analysis into an R package.
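One common workflow for packaging an analysis uses devtools and roxygen2; a minimal sketch, with the package and function names as illustrative assumptions:

```r
library(devtools)

# Scaffold the package: DESCRIPTION, NAMESPACE, and an R/ directory
create("myanalysis")

# Move analysis functions into R/, documented with roxygen comments,
# e.g. R/summarise_run.R:
#   #' Summarise one experimental run
#   #' @param x numeric vector of measurements
#   #' @export
#   summarise_run <- function(x) c(mean = mean(x), sd = sd(x))

document("myanalysis")   # generate man/ pages and update NAMESPACE
check("myanalysis")      # run R CMD check
install("myanalysis")    # install the package locally
```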