R(9), Data Reshaping

Published: by Creative Commons Licence

Data Reshaping in R is about changing how data is organized into rows and columns. Most of the time data processing in R is done by using input data as data frames. It is easy to extract data from the rows and columns of a data frame, but in some cases, the type of the data frame that we need is different from the one we received.

Add rows and columns into a data frame

We can use cbind() to connect multiple vectors to create a data frame. Also, we can use rbind() to merge two data frames.

cbind: column bind

rbind: row bind

# Create vector objects.
city <- c("Tampa","Seattle","Hartford","Denver")
state <- c("FL","WA","CT","CO")
zipcode <- c(33602,98104,06161,80294)

# Combine above three vectors into one data frame.
addresses <- cbind(city,state,zipcode)

# Print a header.
cat("# # # # The First data frame
") 

# Print the data frame.
print(addresses)

# Create another data frame with similar columns
new.address <- data.frame(
   city = c("Lowry","Charlotte"),
   state = c("CO","FL"),
   zipcode = c("80230","33949"),
   stringsAsFactors = FALSE
)

# Print a header.
cat("# # # The Second data frame
") 

# Print the data frame.
print(new.address)

# Combine rows form both the data frames.
all.addresses <- rbind(addresses,new.address)

# Print a header.
cat("# # # The combined data frame
") 

# Print the result.
print(all.addresses)

results:

# # # # The First data frame
    city       state zipcode
[1,] "Tampa"    "FL"  "33602"
[2,] "Seattle"  "WA"  "98104"
[3,] "Hartford" "CT"   "6161"
[4,] "Denver"   "CO"  "80294"

# # # The Second data frame
    city       state   zipcode
1      Lowry      CO      80230
2      Charlotte  FL      33949

# # # The combined data frame
    city      state zipcode
1      Tampa     FL    33602
2      Seattle   WA    98104
3      Hartford  CT     6161
4      Denver    CO    80294
5      Lowry     CO    80230
6     Charlotte  FL    33949

Merge Data Frames

If we use merge() to merge two data frames, the column name of the data frames must be the same.


Data Reshaping is a little hard to understand, understand and master it in latter practice.