Move to adaptor backend #298

nathanjmcdougall · 2024-08-13T10:26:53Z

To set us up for #263 and #254.

At @machow's suggestion here: #263 (comment)
I agree this is definitely a nicer way of doing things.

pins/boards.py

nathanjmcdougall · 2024-08-13T10:57:34Z

pins/tests/_databackend/LICENSE

@@ -0,0 +1,21 @@
+MIT License


Technically complying with MIT requires distributing the license

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

Actually this probably needs to be moved out of tests because I'm not sure if that gets included in the wheel build (I think it doesn't).

The tests directory does get included in the wheel build! I'm not entirely sure where this file should be, but it might be easier to not vendor in this package, if possible 😄

pins/_adaptors.py

+from abc import abstractmethod
+from typing import TYPE_CHECKING, Any, ClassVar, TypeAlias, overload
+
+from typing_extensions import Self


isabelizimm · 2024-08-20T21:46:35Z

I would prefer to not vendor in all of databackend; @machow, wdyt of putting the latest GH release on PyPI? It looks like there is just v0.0.1 on PyPI due to some sort of auth-y error in CI releasing 0.0.2? We can maybe pair through this together?

isabelizimm · 2024-08-27T21:49:00Z

Okay, we are able to use databackend==0.0.3 now, rather than vendoring it all in 🎉


nathanjmcdougall · 2024-08-27T22:42:23Z

Cool, I've rebased away the commit with the vendored code and added the new dependency.

isabelizimm · 2024-12-16T23:52:50Z

Paired with @machow; he'll be taking over this PR and the associated ones (#254 and #263) for review/next steps!

nathanjmcdougall · 2025-03-25T22:27:33Z

@machow hey, it would be good if you could review this PR sometime. No great rush but I'd really love to get #249.

machow

I'm so sorry for the long wait 😓 -- this is looking really good! The 3 small things I think could benefit from a little tweaking are:

It seems like _Adaptor._d should not be typed as a class variable.
_Adaptor.write_json() is overridden in _DFAdaptor in a surprising way (it returns a string), in order to work for data previews. I wonder if the _DFAdaptor version could be renamed to .to_json() to reflect that it's doing a different job.
save_data() is fed an object pulled from _Adaptor._d but then re-creates the adaptor. Could we allow save_data() to take an adaptor, to prevent this round-tripping?

Thanks again for all the work you put into this! If it's useful, I'm happy to pick this up and finish -- especially since so much time has passed, and I've picked up the context again / probably owe it to folks 😭

machow · 2025-04-01T17:27:48Z

pins/_adaptors.py

+    _DataFrame: TypeAlias = _PandasDataFrame
+
+
+class _AbstractPandasFrame(AbstractBackend):


Since the file _adaptors.py starts with an underscore, it seems okay for the contents to not use an underscore (e.g. AbstractPandasFrame).

(though also totally okay to punt this, since it's all internal; especially if other PRs are building on this one)

Yeah I agree with that.
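
For readers unfamiliar with why a class such as _AbstractPandasFrame(AbstractBackend) exists, the general idea is an abstract class that answers isinstance() checks for a third-party type without importing that library up front. The following is a standard-library-only sketch of that idea, not databackend's actual implementation; LazyBackend, AbstractDecimal, and AbstractMissing are hypothetical names, and decimal.Decimal stands in for a DataFrame class.

```python
import decimal
import sys
from abc import ABCMeta


class LazyBackend(metaclass=ABCMeta):
    """Hypothetical stand-in: matches a class by (module, name) lookup."""

    # (module name, class name) pairs that subclasses fill in.
    _backends = []

    @classmethod
    def __subclasshook__(cls, subclass):
        for mod_name, cls_name in cls._backends:
            # Only consult modules that are already imported.
            module = sys.modules.get(mod_name)
            target = getattr(module, cls_name, None) if module is not None else None
            if isinstance(target, type) and issubclass(subclass, target):
                return True
        return NotImplemented


class AbstractDecimal(LazyBackend):
    _backends = [("decimal", "Decimal")]


class AbstractMissing(LazyBackend):
    # Refers to a module that is never imported, so nothing matches.
    _backends = [("not_a_real_module_xyz", "Thing")]


is_decimal = isinstance(decimal.Decimal("1.5"), AbstractDecimal)
is_int_decimal = isinstance(1, AbstractDecimal)
is_missing = isinstance(decimal.Decimal("1.5"), AbstractMissing)
```

With this shape, dispatching on AbstractDecimal never forces the target library to be imported by the code doing the check.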

machow · 2025-04-01T17:37:50Z

pins/_adaptors.py

+
+
+class _Adaptor:
+    _d: ClassVar[Any]


Is the use of ClassVar right here? It seems like _d is not a class variable (it's set on the instance).

Yeah also agreed. I can't recall why I did that, it might have been some misguided thoughts about pyright behaviour.
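
To make the distinction concrete, here is a minimal sketch of the two annotation styles. The class below is a simplified stand-in for the adaptor, and the kind attribute is hypothetical, added only for contrast.

```python
from typing import Any, ClassVar


class Adaptor:
    # Instance attribute: annotated without ClassVar and assigned in __init__.
    _d: Any

    # A genuine class variable shared by every instance (hypothetical).
    kind: ClassVar[str] = "generic"

    def __init__(self, data: Any) -> None:
        self._d = data


a = Adaptor([1, 2])
b = Adaptor({"x": 1})
```

Each instance carries its own _d, while kind lives on the class, which is the situation ClassVar is meant to describe.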

machow · 2025-04-01T17:54:02Z

pins/_adaptors.py

+        self._d = data
+
+    @overload
+    def write_json(self, file: str) -> None: ...


This behavior was modified to sometimes return a string, which seems to violate command-query separation (maybe to reflect the implementation in the pandas adaptor?). Can we revert so that the adaptor refactors, but does not extend the original behavior?

edit: see comment in _DFAdaptor.write_json()

Totally, good idea.

machow · 2025-04-01T18:00:30Z

pins/_adaptors.py

+    def write_json(self, file: str) -> None: ...
+    @overload
+    def write_json(self, file: None) -> str: ...
+    def write_json(self, file: str | None = None) -> str | None:


Okay I think I understand -- this looks like an override of the original .write_json() method, but its job seems different (it's used for the data preview). Its job looks more like the added .shape or .columns properties.

Can we do this?:

rename .write_json() here to something else (maybe .to_json())

change the parent annotation for write_json() to always return None (not sometimes a str)

See what I've done in this commit:

daa4239
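
As an illustration of the split being discussed, the sketch below separates a command (write_json, which writes a file and returns None) from a query (to_json, which returns a string and has no side effects). The classes are simplified stand-ins that use plain JSON-serializable data, not the code in the commit above; RecordsAdaptor is a hypothetical name.

```python
import json
import os
import tempfile
from typing import Any


class Adaptor:
    def __init__(self, data: Any) -> None:
        self._d = data

    def write_json(self, file: str) -> None:
        # Command: persists the data to disk and returns nothing.
        with open(file, "w") as f:
            json.dump(self._d, f)


class RecordsAdaptor(Adaptor):
    def to_json(self) -> str:
        # Query: returns a JSON string (e.g. for a preview) without writing anything.
        return json.dumps(self._d)


data = [{"x": 1}, {"x": 2}]
adaptor = RecordsAdaptor(data)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.json")
    result = adaptor.write_json(path)
    with open(path) as f:
        on_disk = f.read()

preview = adaptor.to_json()
```

Because write_json() keeps a single return type, the @overload pair on the parent is no longer needed.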

machow · 2025-04-01T18:09:55Z

pins/drivers.py

    #       as argument to board, and then type dispatchers for explicit cases
    #       of saving / loading objects different ways.

+    adaptor = _create_adaptor(obj)


Could we allow obj to be either _Adaptor | Any (the typing is redundant, but maybe a helpful signal)? Then, if obj is not an adaptor, this line could create one.

I'm guessing keeping the original save_data behavior is useful for testing, but it'd be nice not to have to round-trip by calling save_data(_Adaptor._d, ...) in this board method:

https://github.com/rstudio/pins-python/pull/298/files#diff-36792b1eedbe5453d2c6b58286ab65eed0c9286e94e2695a736edff37164e885R757

I see you've done that here:

d0fa9c9

I will:

Write a test case for this

Refactor to take advantage of this
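
A minimal sketch of the "accept an adaptor or a raw object" pattern described above. It assumes a simplified Adaptor class and a hypothetical save_data_sketch function; the real save_data signature and the commit referenced above may differ.

```python
from typing import Any, Union


class Adaptor:
    def __init__(self, data: Any) -> None:
        self._d = data


def create_adaptor(obj: Any) -> Adaptor:
    # Pass existing adaptors through unchanged; wrap anything else.
    if isinstance(obj, Adaptor):
        return obj
    return Adaptor(obj)


def save_data_sketch(obj: Union[Adaptor, Any]) -> Adaptor:
    # Hypothetical stand-in for save_data: normalizes the input once,
    # so callers holding an adaptor do not need to unwrap it first.
    adaptor = create_adaptor(obj)
    # ... writing to disk would happen here ...
    return adaptor


existing = Adaptor({"a": 1})
same = save_data_sketch(existing)
wrapped = save_data_sketch([1, 2, 3])
```

Callers that already hold an adaptor get the same object back, so the round trip through _d disappears, while callers passing raw data keep working.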

pins/drivers.py

nathanjmcdougall · 2025-04-01T19:06:55Z

I'll take a look at this later today - but I'm happy for you to pick it up if you prefer. Thank you for taking the time to revisit the project :)

machow · 2025-04-01T19:53:24Z

I can take a quick pass right now, since I've got the basic pins stuff in mind!


machow · 2025-04-01T23:05:05Z

Ah, we're both pushing to this PR -- I'll leave it to you, thanks for working on this! I noticed that a doctest test fails now, and seems to be related to a more recent release of fsspec.

I'm guessing the fsspec GitHub backend doesn't like the two // in this path anymore: pins/tests/pins-compat/df_csv/20220214T163720Z-9bfad//df_csv.csv

If it's useful, we can always ignore that for now and handle it separately from this PR.
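
For reference, one way to see what the path looks like without the doubled slash is the standard library's posixpath.normpath, which collapses repeated separators inside a path. This is only an illustration; the thread does not say that this is how the failing doctest should be fixed.

```python
import posixpath

# The path quoted in the comment above, with the doubled slash.
path = "pins/tests/pins-compat/df_csv/20220214T163720Z-9bfad//df_csv.csv"

# normpath collapses the repeated separator inside the path.
cleaned = posixpath.normpath(path)
```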

nathanjmcdougall · 2025-04-01T23:13:53Z

Sorry yeah unfortunately I think we both started working on it at the same time.

Sounds good - I'm in favour of fixing it in another PR first so after I've addressed the issues here I'll work on that.

machow

I only took a quick glance over the most recent changes -- but this LGTM (isabel feel free to look closer!)

isabelizimm

This looks good to me! I added in typing_extensions as a dependency, which can be dropped after 3.9 isn't supported.

This has been a looong time coming, thank you both for helping get pins into a better place for future dataframe support!
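
On the typing_extensions point: TypeAlias was added to the standard typing module in Python 3.10, so Python 3.9 needs the backport. One common pattern is a version-conditional import, sketched below; this is an assumption about how such a fallback can be written, not necessarily what the PR does, and Records is a hypothetical alias.

```python
import sys
from typing import Any, Dict, List

if sys.version_info >= (3, 10):
    from typing import TypeAlias
else:
    # Requires the typing_extensions package on Python 3.9.
    from typing_extensions import TypeAlias

# Hypothetical alias, declared explicitly so type checkers treat it as an alias.
Records: TypeAlias = List[Dict[str, Any]]
```

Once 3.9 is no longer supported, the else branch and the extra dependency can be dropped.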

nathanjmcdougall mentioned this pull request Aug 13, 2024

Feature/153 support polars in pin_write to parquet #263

Open


nathanjmcdougall force-pushed the feature/move-to-adaptor-backend branch from 43ebacc to 549679a Compare August 20, 2024 22:20

nathanjmcdougall added 11 commits August 28, 2024 10:27

Support adaptor in prepare_pin_version

9dd8beb

Use adaptor in save_data

040da5e

Use adaptor for default_title

4ba393d

underscore prefix for _adaptors.py; abstracting df_type in default_title

7898ce7

Removing duplication in _obj_name definition

4a3ea01

Use adaptor in _create_meta

007ad3a

Pass pyright

d577b02

Fix broken import

3aaabbb

Refactoring type hints to avoid use of Self

56c3285

Various other type improvements

Remove singleton Union

0171d72

Add databackend as a dependency

fe6092f

nathanjmcdougall force-pushed the feature/move-to-adaptor-backend branch from 90355be to fe6092f Compare August 27, 2024 22:40

isabelizimm requested a review from machow December 16, 2024 22:59

Merge branch 'main' into feature/move-to-adaptor-backend

1289134

machow requested changes Apr 1, 2025


dev: add ruff to pyproject.toml

1d5c47f

machow mentioned this pull request Apr 1, 2025

Decide on linting rules to enable for project #323

Open

feat: allow save_data to accept an Adaptor

d0fa9c9

nathanjmcdougall added 4 commits April 2, 2025 11:39

Remove unnecessary underscores

81f6779

Remove misleading/unnecessary ClassVar declaration

1540500

Merge branch 'feature/move-to-adaptor-backend' of https://github.com/…

dd49569

…nathanjmcdougall/pins-python into feature/move-to-adaptor-backend

Separate write_json from to_json (CQS)

daa4239

nathanjmcdougall added 2 commits April 2, 2025 12:25

Move calls to create_adapter to hide them at a lower level

f11141a

Add some tests

13d356e

nathanjmcdougall mentioned this pull request Apr 1, 2025

Potential fsspec issue with double forwardslash #324

Closed

isabelizimm mentioned this pull request Apr 29, 2025

maint: remove extra / in doctest #330

Merged

Merge branch 'rstudio:main' into feature/move-to-adaptor-backend

82ba58a

machow approved these changes May 29, 2025


nathanjmcdougall and others added 2 commits June 4, 2025 11:55

Use backported typing_extensions.TypeAlias for Python 3.9

18818f6

add typing_extensions

dc683dd

isabelizimm approved these changes Jun 4, 2025


isabelizimm merged commit ef9f358 into rstudio:main Jun 4, 2025
27 checks passed

