1 Jul 2012 19:55

## list to dataframe conversion-testing for identical

```
Hi R help,

I was trying to get identical data frame from a list using two methods.

#Suppose my list is:
listdat1<-list(rnorm(10,20),rep(LETTERS[1:2],5),rep(1:5,2))
#Creating dataframe using cbind

dat1<-data.frame(do.call("cbind",listdat1))
colnames(dat1)<-c("Var1","Var2","Var3")
#Second dataframe conversion

dat2<-data.frame(Var1=listdat1[[1]],Var2=listdat1[[2]],Var3=listdat1[[3]])

#Structure is different in two datasets
> str(dat1)
'data.frame':    10 obs. of  3 variables:
$ Var1: Factor w/ 10 levels "18.6153321029756",..: 5 2 6 8 7 9 1 4 3 10
$ Var2: Factor w/ 2 levels "A","B": 1 2 1 2 1 2 1 2 1 2
$ Var3: Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5
> str(dat2)
'data.frame':    10 obs. of  3 variables:
$ Var1: num  20.3 19.2 20.5 20.9 20.5 ...
$ Var2: Factor w/ 2 levels "A","B": 1 2 1 2 1 2 1 2 1 2
$ Var3: int  1 2 3 4 5 1 2 3 4 5

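#Converting structure of dat1 to match dat2 structure
#(go through as.character so the factor levels, not the internal codes, are converted)
dat1<-within(dat1,{Var1<-as.numeric(as.character(Var1))
Var3<-as.integer(as.character(Var3))})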

```

1 Jul 2012 23:09

### Re: list to dataframe conversion-testing for identical

```
Yes it does have something to do with the representation of floating point
numbers. Using cbind() forces the list to become a matrix and that forces
all of the data to become character strings since one of the list elements
is character:

> set.seed(42)
> listdat1<-list(rnorm(10,20),rep(LETTERS[1:2],5),rep(1:5,2))
> str(do.call("cbind", listdat1))
chr [1:10, 1:3] "21.3709584471467" "19.4353018286039" ...

Then you convert that to a data.frame. The default in data.frame() is to
convert characters to factors so you get

> str(data.frame(do.call("cbind",listdat1)))
'data.frame':   10 obs. of  3 variables:
$ X1: Factor w/ 10 levels "19.4353018286039",..: 8 1 5 7 6 2 9 3 10 4
$ X2: Factor w/ 2 levels "A","B": 1 2 1 2 1 2 1 2 1 2
$ X3: Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5

With dat2 you used data.frame() so the numeric fields were not converted to
strings and then factors. Then you converted the dat1 factors back to
numeric. You would be fine with just

> dat1 <- data.frame(listdat1)
> colnames(dat1) <- paste0("Var", 1:3)

Or you can name the list elements and then convert

> names(listdat1) <- paste0("Var", 1:3)
> dat1 <- data.frame(listdat1)

```
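
The floating-point part of the explanation can be seen in isolation. The following is a minimal sketch, assuming the usual behaviour that `as.character()` writes a double with at most 15 significant digits, so a round trip through text need not return exactly the original value. The names `x` and `y` are illustrative and do not come from the thread.

```r
# Round-trip a double through its character representation,
# which is what happens when cbind() builds a character matrix.
x <- 1/3
y <- as.numeric(as.character(x))

as.character(x)          # text form, at most 15 significant digits
sprintf("%.17g", x)      # enough digits to identify the double exactly
sprintf("%.17g", y)      # the value recovered from the text form

identical(x, y)          # exact comparison
isTRUE(all.equal(x, y))  # comparison up to a small tolerance
abs(x - y)               # size of the discrepancy
```

The same loss of low-order digits would apply to each value of `rnorm(10, 20)` once it has passed through the character matrix.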

2 Jul 2012 00:31

### Re: list to dataframe conversion-testing for identical

```
On Jul 1, 2012, at 5:09 PM, David L Carlson wrote:

> Yes it does have something to do with the representation of floating point
> numbers. Using cbind() forces the list to become a matrix and that forces
> all of the data to become character strings since one of the list elements
> is character:
>
>> set.seed(42)
>> listdat1<-list(rnorm(10,20),rep(LETTERS[1:2],5),rep(1:5,2))
>> str(do.call("cbind", listdat1))
> chr [1:10, 1:3] "21.3709584471467" "19.4353018286039" ...
> Then you convert that to a data.frame. The default in data.frame() is to
> convert characters to factors so you get
>
>> str(data.frame(do.call("cbind",listdat1)))
> 'data.frame':   10 obs. of  3 variables:
> $ X1: Factor w/ 10 levels "19.4353018286039",..: 8 1 5 7 6 2 9 3 10 4
> $ X2: Factor w/ 2 levels "A","B": 1 2 1 2 1 2 1 2 1 2
> $ X3: Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5 1 2 3 4 5

Yes, arun. Had the data frame been built by calling data.frame() directly
on the list, a more natural and expected result would have occurred:

> dat1<-do.call("data.frame",listdat1)
> colnames(dat1)<-c("Var1","Var2","Var3")
```
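
The difference between the two construction routes can be checked by comparing column classes. This is only a sketch: the names `via_matrix` and `via_call` are illustrative, and `stringsAsFactors = TRUE` is passed explicitly because the default changed to `FALSE` in R 4.0.0, so without it a current R would give character columns rather than the factors shown in this thread.

```r
listdat1 <- list(Var1 = rnorm(10, 20),
                 Var2 = rep(LETTERS[1:2], 5),
                 Var3 = rep(1:5, 2))

# Route 1: cbind() first, which yields a character matrix.
via_matrix <- data.frame(do.call("cbind", listdat1),
                         stringsAsFactors = TRUE)

# Route 2: hand the list elements straight to data.frame().
via_call <- do.call("data.frame",
                    c(listdat1, stringsAsFactors = TRUE))

sapply(via_matrix, class)  # every column went through character
sapply(via_call, class)    # original types are kept
```

Naming the list elements up front also avoids the separate `colnames<-` step.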

2 Jul 2012 02:18

### Re: list to dataframe conversion-testing for identical

```
Hi all,

A.K.

----- Original Message -----
From: David Winsemius <dwinsemius <at> comcast.net>
To: arun <smartpink111 <at> yahoo.com>
Cc: R help <r-help <at> r-project.org>
Sent: Sunday, July 1, 2012 6:31 PM
Subject: Re: [R] list to dataframe conversion-testing for identical

```

2 Jul 2012 04:15

### Re: list to dataframe conversion-testing for identical

```
Hi David & Rui,

It must be the floating point representation.
dat1$Var1<-round(dat1$Var1)
dat2$Var1<-round(dat2$Var1)

identical(dat1,dat2)
[1] TRUE

I knew that "cbind" is not ideal for converting to dataframe.  But, I used it to understand the differences.

Thanks again,

A.K.

----- Original Message -----
From: David L Carlson <dcarlson <at> tamu.edu>
To: 'arun' <smartpink111 <at> yahoo.com>; 'R help' <r-help <at> r-project.org>
Cc:
Sent: Sunday, July 1, 2012 5:09 PM
Subject: RE: [R] list to dataframe conversion-testing for identical

```

1 Jul 2012 23:13

### Re: list to dataframe conversion-testing for identical

```
Hello,

But

> all.equal(dat1,dat2)
[1] TRUE

So I guess it does have to do with floating-point equality: identical()
requires exact equality, whereas all.equal() compares numeric values up to a
tolerance, by default .Machine$double.eps^0.5 (about 1.5e-8). (identical()
can therefore return FALSE on occasions when we would expect TRUE.)

```
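
The comparison discussed in this message can be reproduced end to end. A sketch under stated assumptions: the list elements are named up front, `stringsAsFactors = TRUE` is set explicitly so that a current R behaves like the 2012 default, and `Var3` is converted via `as.character()` rather than directly from the factor codes.

```r
set.seed(42)
listdat1 <- list(Var1 = rnorm(10, 20),
                 Var2 = rep(LETTERS[1:2], 5),
                 Var3 = rep(1:5, 2))

# Route 1: via a character matrix, then convert the columns back.
dat1 <- data.frame(do.call("cbind", listdat1), stringsAsFactors = TRUE)
dat1$Var1 <- as.numeric(as.character(dat1$Var1))
dat1$Var3 <- as.integer(as.character(dat1$Var3))

# Route 2: straight from the list, no detour through text.
dat2 <- data.frame(listdat1, stringsAsFactors = TRUE)

identical(dat1, dat2)                            # exact comparison
all.equal(dat1, dat2)                            # default tolerance
all.equal(dat1$Var1, dat2$Var1, tolerance = 0)   # report any difference
```

Compared with rounding both columns, `all.equal()` leaves the data untouched and only relaxes the comparison.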