The current implementation of `close()` in `AbstractBufferedFile` is as follows:
```python
def close(self):
    """Close file

    Finalizes writes, discards cache
    """
    if getattr(self, "_unclosable", False):
        return
    if self.closed:
        return
    if self.mode == "rb":
        self.cache = None
    else:
        if not self.forced:
            self.flush(force=True)  # <--- can throw exceptions

        if self.fs is not None:
            self.fs.invalidate_cache(self.path)
            self.fs.invalidate_cache(self.fs._parent(self.path))

    self.closed = True  # <--- we always need to reach here.
```
This method calls `flush()` to finalize the write. However, if `flush()` raises an exception, the code never reaches `self.closed = True`, so the file is still considered "open".
If `close()` is called again, the file object tries to flush the data again. This can lead to unexpected side effects, and it does not satisfy the convention defined by `IOBase.close()`:
> Flush and close this stream. This method has no effect if the file is already closed. Once the file is closed, any operation on the file (e.g. reading or writing) will raise a ValueError.
>
> As a convenience, it is allowed to call this method more than once; only the first call, however, will have an effect.

https://docs.python.org/3/library/io.html#io.IOBase.close
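The problem can be reproduced without fsspec. The following is a minimal sketch, assuming a hypothetical `FlakyFile` class (not part of fsspec) whose `flush()` always fails and whose `close()` follows the same control flow as the code quoted above:

```python
class FlakyFile:
    """Hypothetical stand-in for a buffered file whose flush() always fails."""

    def __init__(self):
        self.closed = False
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        raise OSError("simulated write failure")

    def close(self):
        if self.closed:
            return
        self.flush()        # raises, so the next line is never reached
        self.closed = True


f = FlakyFile()
for _ in range(2):
    try:
        f.close()
    except OSError:
        pass

# The object still reports itself as open, and flush() ran on every close().
print(f.closed, f.flush_calls)  # False 2
```

Each call to `close()` repeats the failing flush, which is the opposite of "only the first call will have an effect".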
As I reported in #1685, garbage collection also calls `close()`. Therefore, even if we explicitly close a file object only once, `close()` is called at least twice when `flush()` throws an exception. In addition, an idempotent `close()` is generally expected of file objects in many programming languages.
To make the behaviour of file objects simpler and more predictable when errors occur, it would be better to always set `self.closed = True`, regardless of the result of `flush()`.
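One possible way to get this behaviour is a `try`/`finally` around the flush. This is only a sketch of the idea, using a hypothetical `SafeCloseFile` class rather than the real `AbstractBufferedFile`; the actual fix in fsspec may look different:

```python
class SafeCloseFile:
    """Hypothetical stand-in, not the fsspec class."""

    def __init__(self):
        self.closed = False
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        raise OSError("simulated write failure")

    def close(self):
        if self.closed:
            return
        try:
            self.flush()
        finally:
            # Runs whether or not flush() raised, so the object is always
            # marked closed and later close() calls become no-ops.
            self.closed = True


f = SafeCloseFile()
first_error = None
try:
    f.close()
except OSError as exc:
    first_error = exc   # the original error still reaches the caller

f.close()               # second call (e.g. from garbage collection) does nothing
print(f.closed, f.flush_calls)  # True 1
```

The exception from the first `close()` still propagates, so the caller learns that the write failed, but the flush is never retried implicitly.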
The current `close()` implementation quoted at the top of this issue is taken from https://github.com/fsspec/filesystem_spec/blob/2024.9.0/fsspec/spec.py#L2022-L2041