ff() fails when the product of dim is too large to cast to an integer #3

@khughitt

Greetings!

I'm attempting to use ff() via the bigcor() function in https://github.com/anspiess/propagate.

When the input matrix size exceeds a certain limit, however, ff() fails with an error:

Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 0 and .Machine$integer.max") : 
  missing value where TRUE/FALSE needed
Calls: ff
In addition: Warning message:
In ff(vmode = "double", dim = c(num_cols, num_cols)) :
  NAs introduced by coercion to integer range
Execution halted

I tracked down the issue to ff.R:2465:

n <- as.integer(prod(dim))

When the dimensions are too large (in my case, roughly 4.65e4 x 4.65e4 or larger), prod(dim) exceeds .Machine$integer.max, so as.integer() returns NA with a warning:

r$> prod(c(4.65e4, 4.65e4))                                                                                                             
[1] 2162250000

r$> as.integer(prod(c(4.65e4, 4.65e4)))                                                                                                 
[1] NA
Warning message:
NAs introduced by coercion to integer range 

r$> .Machine$integer.max                                                                                                                
[1] 2147483647
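
For reference, the exact cutoff for a square dim can be worked out in base R. This is only a quick sketch of the arithmetic; max_side is a name I made up for illustration:

```r
# Largest n for which n * n still fits in R's 32-bit integer type.
# max_side is an illustrative name, not anything defined by ff.
max_side <- floor(sqrt(.Machine$integer.max))

max_side                                   # 46340
max_side^2 <= .Machine$integer.max         # TRUE
(max_side + 1)^2 <= .Machine$integer.max   # FALSE
```

So any square matrix with more than 46340 rows/columns should hit the NA shown above.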

Do you know if there is any way around this?

Otherwise, perhaps it would be worth checking for this early on and telling the user clearly that ff() cannot proceed?
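
As a rough illustration of the kind of check I have in mind: check_ff_dim below is a hypothetical helper, not something ff provides, and the message wording is only a suggestion. It assumes the limit is .Machine$integer.max, as the existing error message implies.

```r
# Hypothetical pre-flight check (not part of ff): validate dim before
# the total length is coerced with as.integer().
check_ff_dim <- function(dim) {
  # Multiply as doubles so the product itself cannot overflow to NA
  n <- prod(as.numeric(dim))
  if (!is.finite(n) || n < 0 || n > .Machine$integer.max) {
    stop(sprintf(
      "prod(dim) = %.0f is outside the supported range 0..%d",
      n, .Machine$integer.max
    ))
  }
  # Safe to coerce now; returned invisibly for convenience
  invisible(as.integer(n))
}
```

With this, check_ff_dim(c(100, 100)) returns 10000 invisibly, while check_ff_dim(c(4.65e4, 4.65e4)) stops with a readable message instead of the "missing value where TRUE/FALSE needed" error.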

Related downstream issue: anspiess/propagate#4

System Information

Attaching package ff
- getOption("fftempdir")=="/tmp/Rtmp5BMuOZ/ff"

- getOption("ffextension")=="ff"

- getOption("ffdrop")==TRUE

- getOption("fffinonexit")==TRUE

- getOption("ffpagesize")==65536

- getOption("ffcaching")=="mmnoflush"  -- consider "ffeachflush" if your system stalls on large writes

- getOption("ffbatchbytes")==16777216 -- consider a different value for tuning your system

- getOption("ffmaxbytes")==536870912 -- consider a different value for tuning your system


Attaching package: ‘ff’

The following objects are masked from ‘package:utils’:

    write.csv, write.csv2

The following objects are masked from ‘package:base’:

    is.factor, is.ordered

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS:   /usr/lib/libopenblasp-r0.3.10.so
LAPACK: /usr/lib/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ff_4.0.2  bit_4.0.4

loaded via a namespace (and not attached):
[1] compiler_4.0.2  parallel_4.0.2  RJSONIO_1.3-1.4

Cheers,
Keith
