Uncategorized

Data aggregation using dplyr package in R (sample snippet)

Posted on Updated on

In this post, I am using SuperStore data to explore some of the data wrangling functions from these two packages.

The data is in the form of an Excel workbook with three sheets namely – Orders, Returns, and Region. I am loading the three different sheets into separate datasets into R, joining them and performing necessary aggregations.

#----------------------------------------------------------------------
# Author: Naveen Goje, Email: naveen.goje@gmail.com
#----------------------------------------------------------------------

setwd("C:/R")

#----------------------------------------------------------------------
# 1 Load required libraries 
#----------------------------------------------------------------------

#For accessing and dumping excel files
install.packages("openxlsx")
library(openxlsx) 
#Used for data wrangling
install.packages("dplyr")
library(dplyr) 
#Used for data wrangling
install.packages("tidyr")
library(tidyr)

#----------------------------------------------------------------------
# 2 Load three individuals sheets into separate datasets 
#----------------------------------------------------------------------
superstore.wb

datastore

# Sum of Sales by Product Category 
data.storeOrders%>%
 group_by(Product.Category) %>%
 summarise(Total.Sales = sum(Sales)) %>%
 arrange(Total.Sales)

aggr_1

#----------------------------------------------------------------------
# 3 Join data sets and aggregate as per requirement
#----------------------------------------------------------------------
# Inner join the two data sets Order and Users by Region and look at Total Sales by Region
data.final %
 group_by(Region) %>%
 summarise(Total.Sales = sum(Sales))
data.final

aggr_2

Spotfire: Passing value from HTML/JavaScript to Document Property

Posted on Updated on

Reference snippet to pass a radio button value into Spotfire context, that is into a Document Property.



<FORM><B>Select date for filtering:</B>
<P>
<INPUT id=”pack1″
onclick=”document.getElementById(‘145de349837d49f493cc9ec23743b05f’).value=document.getElementById(‘pack1’).value”
value=”N” CHECKED type=radio name=pack1>
Date 1<BR>
<INPUT id=”pack2″
onclick=”document.getElementById(‘145de349837d49f493cc9ec23743b05f’).value=document.getElementById(‘pack2’).value”
value=”Y” type=radio name=pack1>Date 2
</FORM>
</P>


HTML to Document Property

Link to sample Spotfire analyses

R Shiny – Histogram

Posted on

Using Shiny Package in R to develop a Histogram with binning.

R_Shiny_Histogram

Reference:

Histogram – RStudio Shiny

Going beyond base Tableau – Unofficial Add-ons

Posted on Updated on

Advanced data visualization tools such as Tableau and Spotfire bundle capabilities such as Table calculations and Custom expressions respectively and make new dimensions of viewing data possible. Further, interactivity and customization which could be brought into dashboards with the use of parameters (in Tableau) and property controls (in Spotfire) set apart these visual analytic products from conventional BI query and analysis tools.

However, apart from out-of-the-box functionality, more could be achieved using these tools with some minor tweaks. One such example is to use phpGrid in a webpage (as part of the Tableau dashboard) as a user interface to show dynamic views depending on the data entered on the grid.

Below is a simple gantt chart which has parameters to filter data. Interactivity is limited to filtering data depending on the filters which only work on the data from the source file.

Gantt_Chart_1

However, there are a couple of ways we could update the data into a database with these parameters. First is to use a phpgrid to interact with the database using the refresh button action. And another method is to use the GET method (to pass parameters as part of the URL) using the dashboard webpage.

By Ryan Robitaille

http://ryrobes.com/tableau/tableau-phpgrid-an-almost-instant-gratification-data-entry-tool/

By Russell Christopher

http://tableaulove.tumblr.com/post/27627548817/another-method-to-update-data-from-inside-tableau

Guided Selling using R

Posted on Updated on

# Setting current working directory
getwd()
setwd("D:/R")
getwd()
# Read data from Last.FM frequency matrix
data.germany head(data.germany[,c(1,3:8)])

# Drop the user column and make a new data frame
data.germany.ibs

# Create a helper function to calculate the cosine between two vectors
getCosine {
this.cosine return(this.cosine)
}

# Create a placeholder dataframe listing item vs. item
holder data.germany.ibs.similarity

# Lets fill in those empty spaces with cosine similarities
for(i in 1:ncol(data.germany.ibs)) {
for(j in 1:ncol(data.germany.ibs)) {
data.germany.ibs.similarity[i,j]= getCosine(data.germany.ibs[i],data.germany.ibs[j])
}
}

# Output similarity results to a file
write.csv(data.germany.ibs.similarity,file="final-germany-similarity.csv")

# Get the top 10 neighbours for each
data.germany.neighbours

for(i in 1:ncol(data.germany.ibs))
{
data.germany.neighbours[i,] }

# Output neighbour results to a file
write.csv(file="final-germany-item-neighbours.csv",x=data.germany.neighbours[,-1])