You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am attempting to use the block wise cache to read from networked drive however I am finding that the cache does not store any blocks.
I believe this is due to the fact that LocalFileOpener does not inherit from AbstractBufferedFile but io.IOBase
I have experimented by modifying LocalFileOpener to use AbstractBufferedFile instead and and it appears to work as expected.
Any help would be greatly apprecitated.
Modified LocalFileOpener:
class LocalFileOpener(AbstractBufferedFile):
def __init__(
self, path, mode, autocommit=True, fs=None, compression=None, **kwargs
):
super().__init__(fs, path, mode, **kwargs)
logger.debug("open file: %s", path)
self.f = None
self.compression = get_compression(path, compression)
# self._open()
def _open(self):
if self.f is None or self.f.closed:
if self.autocommit or "w" not in self.mode:
self.f = open(self.path, mode=self.mode)
if self.compression:
compress = compr[self.compression]
self.f = compress(self.f, mode=self.mode)
else:
# TODO: check if path is writable?
i, name = tempfile.mkstemp()
os.close(i) # we want normal open and normal buffered file
self.temp = name
self.f = open(name, mode=self.mode)
if "w" not in self.mode:
self.size = self.f.seek(0, 2)
self.f.seek(0)
self.f.size = self.size
def _fetch_range(self, start, end):
# probably only used by cached FS
if "r" not in self.mode:
raise ValueError
self._open()
self.f.seek(start)
return self.f.read(end - start)
The text was updated successfully, but these errors were encountered:
Your understanding is correct. Your code looks reasonable, but I worry whether calling seek() for every single read has a performance cost. Additionally, using AbstractBufferedFile would invoke the builtin bytes caching mechanism by default, which very likely is a bad idea when we expect the OS to also be doing readahead smart(er) caching.
Thank you for your response. I also have concerns about the continuous seek. The _fetch_range implementation remains unchanged from the existing LocalFileOpener. Anecdotally, I haven’t noticed a significant performance difference between the two.
I’ve also only focused here on the read implementation here and haven’t explored what it would take to make this work for write operations as well.
I am attempting to use the block wise cache to read from networked drive however I am finding that the cache does not store any blocks.
I believe this is due to the fact that LocalFileOpener does not inherit from AbstractBufferedFile but io.IOBase
I have experimented by modifying LocalFileOpener to use AbstractBufferedFile instead and and it appears to work as expected.
Any help would be greatly apprecitated.
Modified LocalFileOpener:
The text was updated successfully, but these errors were encountered: