-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement mv method in GDriveFileSystem #217
Conversation
@simone-viozzi this is from the docs:
Does it make sense to use the Upload for this? E.g. like this https://docs.iterative.ai/PyDrive2/filemanagement/?highlight=metadata#update-file-metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to review the workflow around Upload()
Right now is possible to rename a file using existing methods: file.FetchMetadata(fields="title")
file["title"] = "new_file1"
file.Upload() and looking at the code for Lines 499 to 512 in 64a50a1
it uses i discovered this way to rename a file with #91 but it would be much more convenient and user friendly to just have a method that does that |
@simone-viozzi could you check if fsspec interface had one more question:
is it possible to do this w/o a call to Fetch? (if I know the file id). I wonder also if it's possible to do this if I know only a file name w/o any extra calls. That would be a good benefit of this I think. |
No, without fetch it doesn't do nothing. Consider the minimal example: file = drive.CreateFile({'id': "13nYjSBX7FmSUFnBcTuT6hKsFswXhqcpzpioei5r2eRQ"})
file["title"] = "new_file1"
file.Upload() The stack-trace of
Lines 92 to 103 in 64a50a1
i think because the title was With the rename method i implemented: you don't need to fetch it before because i specified it in the request:
there should be a move method (#119) but i can't find it on the current version. Also there is a reference to it in one of the tests: PyDrive2/pydrive2/test/test_fs.py Lines 90 to 102 in 64a50a1
i suppose it's an old test and the method was dropped somewhere. Anyway, there is no |
yep, thanks. I remember it now. I was trying to fix it, but it was not that trivial. Ideally, it should be fixed though.
https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem.rename That's the default implementation of https://github.com/fsspec/filesystem_spec/blob/master/fsspec/spec.py#L917-L920 That's the https://github.com/fsspec/filesystem_spec/blob/master/fsspec/spec.py#L1251 It's not terrible (if we optimize So, ideally, we can take a look and see if we can fix the #91 + optimize fsspec to provide a better interface for this. Then we can see if we need a separate wrapper on top of Let me know wdyt. |
oh, i get it now, thank you! i think we can definitively optimize them, maybe for the rename we could use otherwise we can do just a function on the def rename(path, new_title)
id = _get_item_id(path)
file = drive.CreateFile({'id': id})
file.FetchMetadata(fields="title")
file["title"] = new_title
file.Upload() But the current implementation of rename is: So maybe the best is to change the implementation of move in a way that:
This way we can leave the default implementation of rename. The move method will become something like: def move(self, path1, path2):
path1_left = path1.split('/')[:-1]
path1_right = path1.split('/')[-1]
path2_left = path1.split('/')[:-1]
path2_right = path1.split('/')[-1]
if path1_left == path2_left:
rename file from path1_right to path2_right
else:
move file from path1 to path2 what do you think? This way even if someone use move to rename a file it would call the right API |
Re moving files, check this https://developers.google.com/drive/api/guides/folder#move_files_between_folders I feel it can be generalized to a single Patch call that updates parents + title if needed, wdyt? The only thing we'll need is to get ids for those parents, for the file, etc. It should be done carefully. The most annoying thing on Google Drive is that one path can resolve into multiple IDs. |
check out: https://rclone.org/commands/rclone_dedupe/ i think for the move / rename method we can just throw an error if a path resolve in multiple IDs.
Good! |
mh, funny. fs.mv('root/tmp/a.pdf', 'root/tmp/b.pdf') but the move part does not. Even if i change the parent array, the Lines 499 to 512 in 64a50a1
I also tried on manually calling the privates methods: if file1_name != file2_name:
file1._FilesPatch()
if file1_parent != file2_parent:
file1._FilesUpdate() But @shcheklein do you have any idea why? if i did the fetch before it should get that the array has changed |
Ok, now it does work. both rename / move and rename and move at the same time. i changed: file1["parents"].append({"id": file2_parent_id}) to: file1["parents"] = [{"id": file2_parent_id}] This way when i don't know if it's ok to just reset the parents array, on a normal drive it's not allowed to have a folder with multiple parents but in some other drives maybe is possible. |
I also discovered that we don't need to use Please take a look at the code, for a single file it should be ready for a review. Than i will see how to implement |
@simone-viozzi thanks! it's a good iteration, I put some comments to improve it.
can you give more context please? |
@shcheklein how you want to handle Regarding |
Hey @shcheklein could you please review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, some stylistic changes. Good stuff overall. I'm not sure how recursive is handled, also we need to add test, would be great to count the number of api calls in those tests also
pydrive2/fs/spec.py
Outdated
if maxdepth: | ||
raise NotImplementedError("Max depth move is not supported") | ||
|
||
if self.isdir(path1) and not recursive: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: how do we handle recursive after all? Why do we need this extra check. Is it only as a safeguard to check the intention? I'm asking since even the isdir
check can be expensive (extra call).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you move a folder, the content is moved with it. So is kinda always recursive.
The check is there to avoid that a user wanted to move a file (recursive=False
) but by accident the source is a directory.
We could also remove recursive
and always assume that the user want to move the entire folder. This way, there is no need to check if the source is a directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say yes (to avoid that extra API call). Also data does not disappear anywhere, it's not that dangerous.
how does the local filesystem fsspec implementation approach this?
unix mv
doesn't require the recursive flag afair
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did that to not modify the default method signature of ffspec
:
https://github.com/fsspec/filesystem_spec/blob/2633445fc54797c79a7ac96d213bd24dfbdfcdc2/fsspec/spec.py#L917-L920
In this case, the flags are passed down to copy.
But yes, if it's allowed to change the signature, we can also remove recursive
and the isDir
check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just ignore it. We are not allowed to change the default.
We can also check still how they implement the local fs. Is it unix way or it check the recursive flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also check still how they implement the local fs. Is it unix way or it check the recursive flag.
Where?
@shcheklein there is already a test: PyDrive2/pydrive2/test/test_fs.py Line 90 in 688459b
|
Q: How did it work before if mv was not implemented? |
@shcheklein regarding the recursive behavior, there is no actual need of recursion. So it is possible to move an entire folder structure with just one call. |
Before |
okay, let's remove / ignore it for now |
I removed the |
Thanks @simone-viozzi ! |
Hi, i'm addind a method to rename a
GoogleDriveFile
the google api to rename a file is: https://developers.google.com/drive/api/v2/reference/files/patch
I saw you have a wrapper for
files().patch
:_FilesPatch
do you prefer if I use the existing wrapper or the api directly?