Specify range for xlsx_cells() #25

Open
SteveBronder opened this issue Apr 19, 2018 · 7 comments

@SteveBronder

I have a large Excel file that causes xlsx_cells() to crash. It would be nice to be able to say "only get this many rows and this many columns" when calling xlsx_cells().

Could something be put in xlsxsheet::parseSheetData?

@nacnudus
Owner

Hi, thanks for the suggestion. If you're thinking of something like readxl's range argument, then I agree, that would be good.

Presumably you've already tried reading one sheet at a time with the sheets argument?

@SteveBronder
Author

Yes, I have. My problem is that the workbook is 67 MB in size (yes, yikes!). Reading an individual sheet still causes xlsx_cells() to crash.

@SteveBronder
Author

Actually, I've found that the person who made these Excel files dragged the formatting down to the last possible row across a bunch of columns. So my actual issue is that, for each sheet, xlsx_cells() is trying to parse a ton of rows that contain only formatting. File size is not really the issue, but n_cols and n_rows arguments would be rad for solving this.

In my particular case I only need the first three rows or so.

@nacnudus
Owner

I think a first step is to optionally omit blank cells. When readxl implemented range import it was complicated, and I want to take care to do it as similarly as possible.

@SteveBronder
Author

SteveBronder commented Apr 28, 2018 via email

nacnudus added a commit that referenced this issue May 1, 2018
For files that are too big to import because whole columns have been formatted
(but are mostly blank) #25
nacnudus added a commit that referenced this issue May 1, 2018
For files that are too big to import because whole columns have been formatted
(but are mostly blank) #25
@nacnudus
Owner

nacnudus commented May 1, 2018

@SteveBronder blank cells can now be excluded on the master branch.

xlsx_cells(x, include_blank_cells = FALSE)

I'll keep this issue open for the range feature.
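
Until a range argument exists, the "first three rows" case from earlier in the thread could be approximated roughly as below. This is a minimal sketch, not the maintainer's recommended method. It assumes a tidyxl version that has include_blank_cells, and it uses the examples.xlsx file bundled with the package as a stand-in for the real workbook. The row cutoff is applied after import, so it only trims the result and does not reduce the parsing work itself.

```r
library(tidyxl)

# Stand-in workbook: the example file shipped with tidyxl (an assumption;
# replace with the path to the real file).
path <- system.file("extdata/examples.xlsx", package = "tidyxl")

# Skip cells that carry formatting but no content, which is what made the
# formatted-to-the-last-row sheets too large to import.
cells <- xlsx_cells(path, include_blank_cells = FALSE)

# Keep only the first three rows of each sheet. This filter runs after the
# import, so it is a post-hoc subset rather than a true range read.
first_rows <- cells[cells$row <= 3, ]
```

The sheets argument mentioned above can be combined with this to import one sheet at a time.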

@nacnudus nacnudus changed the title [feature request] add n_max to xlsx_cells [feature request] specify range for xlsx_cells() May 1, 2018
@nacnudus nacnudus changed the title [feature request] specify range for xlsx_cells() Specify range for xlsx_cells() May 1, 2018
@SteveBronder
Author

Ty so much!!
