initializing a workbook

@richfitz 

From my [onion-peeling adventure](https://github.com/rsheets/rexcel/pull/7), I gather that if I read a worksheet via `rexcel_read()`, I drop into `rexcel_read_workbook()` and then into `rexcel_read_worksheet()`. At least, those are the exported functions called. It feels like there's one more layer or one more function than necessary. **Q1: Can you help me understand the role of `rexcel_read()`?** I think it's the one whose purpose isn't clear.

In `googlesheets`, for better or worse, there's an explicit registration step, that creates an R object with metadata about a Google Sheet. Only with that in hand can you start reading stuff back out of it. With Google Sheets, this is practically a requirement vs. a voluntary design decision. But would a similar workflow make sense for `rexcel`?

I think I'm proposing that most of what's in `rexcel_read_workbook()` get moved into a workbook "registration" function. So that it's possible to get set up to read a workbook w/o actually diving down into any worksheets (currently not possible, I believe?).

I also think (correct me) that the current reading functions leave behind little to no info for worksheets that weren't specifically requested. Again, for a Google Sheet, when I register it, I create an overview of all worksheets (name and extent, mostly). When I think about us characterizing the Enron corpus, it would be nice to be able to register each workbook (15K) and get high-level info on the worksheets (80K) w/o necessarily reading their cells.

**Q2: what do you think of a registration-based workflow?**

**Q3: what do you think of marshalling more data about worksheets at registration / workbook creation time?** It creates an intermediate between practically no info and full reading of cells, etc.
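To make the registration idea a bit more concrete, here is a rough sketch of the kind of thing I have in mind. It is not how `rexcel` works today: `register_workbook()` and the `workbook_registration` class are hypothetical names I made up for illustration, and the sketch assumes the `xml2` package is installed. It only collects worksheet names and ids from `xl/workbook.xml` and never opens a worksheet part, so no cells are read. Getting extents as well would mean peeking into each sheet's XML, which I've left out.

```r
# Hypothetical sketch: register_workbook() is NOT an rexcel function.
# Assumes the xml2 package is installed and `path` points to an .xlsx file.
register_workbook <- function(path) {
  stopifnot(file.exists(path))

  # Scratch directory for the one file we extract; cleaned up on exit.
  tmp <- tempfile()
  dir.create(tmp)
  on.exit(unlink(tmp, recursive = TRUE))

  # An xlsx file is a zip archive; xl/workbook.xml lists the worksheets.
  utils::unzip(path, files = "xl/workbook.xml", exdir = tmp)
  doc <- xml2::read_xml(file.path(tmp, "xl", "workbook.xml"))

  # local-name() sidesteps the default spreadsheetml namespace.
  sheets <- xml2::xml_find_all(doc, "//*[local-name()='sheet']")

  # Workbook-level metadata only: where the file lives and which sheets it has.
  structure(
    list(
      path = normalizePath(path),
      sheets = data.frame(
        name = xml2::xml_attr(sheets, "name"),
        sheet_id = xml2::xml_attr(sheets, "sheetId"),
        stringsAsFactors = FALSE
      )
    ),
    class = "workbook_registration"
  )
}
```

With something like this, `wb <- register_workbook(path)` followed by `wb$sheets` would give the per-workbook overview I'd want for the Enron characterization, and reading cells would be a separate, later step that takes `wb` as input.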

Finally, it seems like one can return a `linen::worksheet` (`rexcel_read_worksheet()` does) and I wonder what that even means. Early on, the student who worked with me on `googlesheets` also allowed direct access to worksheets and this caused trouble. Technically, it was a problem because she implemented it in a way that ran up against some of `XML`'s worst gotchas re: memory leakage. But conceptually it was also tricky. A worksheet can't exist outside a workbook, so you were always dragging around host workbook info anyway. So we implemented a policy where you either interacted with the object that comes from registering a sheet or with data coming out of the sheet. But there was no user-facing tangible notion of anything in between. I know our situation is different (R6 class, local xlsx, etc.) but still ....

**Q4: what's the deal with worksheet objects?** This question is kinda vague. Sorry.

_let me know if we should just Skype for some/all of these_


initializing a workbook #8

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

initializing a workbook #8

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions