analogous to "TEXT_EXTRACTED" tag when extracting text and tags for tokenization and removal of bibliographies.
Without these PDF tags, only preprocessed input files can be identified using infolisFileTags. PDF files can then be excluded, but not explicitly included.
Important: assigning PDF tags requires one change in TextExtractor: the PDF tag must not be passed on to the created infolisFiles.
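
The tag handling described above can be illustrated with a minimal sketch. It is not the actual TextExtractor code: the tag values "PDF" and "TEXT_EXTRACTED", the function names, and the dictionary-based file model are assumptions made for illustration, and Python is used only to keep the example short. The sketch assumes that a file created by text extraction inherits all tags of its source except the PDF tag and receives the TEXT_EXTRACTED tag.

```python
# Hypothetical tag values; the project's actual constants may differ.
PDF_TAG = "PDF"
TEXT_EXTRACTED_TAG = "TEXT_EXTRACTED"


def tags_for_extracted_file(source_tags):
    """Return the tags for a text file created from a PDF input file.

    All tags of the source are passed on except the PDF tag, and the
    TEXT_EXTRACTED tag is added (assumed behaviour, see lead-in).
    """
    derived = {tag for tag in source_tags if tag != PDF_TAG}
    derived.add(TEXT_EXTRACTED_TAG)
    return derived


def select_files(files, required_tag):
    """Return the names of all files that carry the required tag."""
    return [name for name, tags in files.items() if required_tag in tags]


# A PDF input file and the text file extracted from it.
pdf_tags = {PDF_TAG, "corpusA"}
files = {
    "paper.pdf": pdf_tags,
    "paper.txt": tags_for_extracted_file(pdf_tags),
}

# Because the PDF tag is not passed on, selecting by it yields only the PDF.
pdf_only = select_files(files, PDF_TAG)
```

If the PDF tag were passed on to the extracted file, selecting by that tag would return both files, and PDF input could no longer be told apart from preprocessed text.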