Allow content to expire #664

Open. 1 commit to merge into base branch asf-site.

lindong28
Member

lindong28 commented Jul 19, 2023

The purpose of this PR is to allow outdated content on the nightly-build Flink website to expire.

Currently, flink-web does not explicitly set ExpiresActive in .htaccess, so content expiration is disabled by default. As a result, a CDN or a user's web browser might keep serving outdated content even after that content has been removed.

For example, this Flink ML web link still serves a page that has been deleted from the Flink ML repo even though the URL says "flink-ml-docs-master".

This PR fixes this problem by making the following changes:

  • Enable expiration by setting ExpiresActive on
  • Set text/html typed content to expire/refresh once every hour.
  • Set all other content (e.g. image/jpg) to expire/refresh once every day.
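
The diff itself is not shown in this conversation, so the following is only a minimal sketch of what the three changes above could look like in .htaccess, assuming Apache's mod_expires module is available. The `access` base is an assumption; the reference [3] discusses the `modification` base, which the actual PR may use instead.

```apache
# Sketch only: the exact directives in the PR may differ.
<IfModule mod_expires.c>
  # Turn on generation of Expires and Cache-Control: max-age headers.
  ExpiresActive On

  # Fallback for every content type without a more specific rule
  # (e.g. images): treat a cached copy as fresh for one day.
  ExpiresDefault "access plus 1 day"

  # HTML pages: treat a cached copy as fresh for one hour.
  # Assumption: "access" counts from the time of the request;
  # "modification" would count from the file's last-modified time.
  ExpiresByType text/html "access plus 1 hour"
</IfModule>
```

With rules like these, a browser or CDN may reuse a cached HTML page for up to an hour and other assets for up to a day before it has to go back to the server.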

See [1] for a discussion of similar issues and the suggestion by Apache infra team. See [2] for documentation of the HTTP directives added in this PR. See [3] for a detailed explanation of the modification directive.

[1] https://issues.apache.org/jira/browse/INFRA-18519
[2] https://github.com/apache/echarts-website/blob/asf-site/.htaccess
[3] https://stackoverflow.com/questions/562802/cache-expire-control-with-last-modification

@lindong28
Member Author

@MartijnVisser Do you have time to review this PR?

@MartijnVisser
Contributor

MartijnVisser commented Jul 19, 2023

Do you have time to review this PR?

Sure. I do think that there's a different issue, though. This .htaccess file is only used on the https://flink.apache.org project website, not for the documentation that's built on https://nightlies.apache.org/flink.

I don't immediately see a workflow for building the flink-ml docs: where is that done?
Edit: I see tools/ci/docs.sh but I don't see any workflow triggering that?

@lindong28
Member Author

@MartijnVisser Thanks for the comments.

The flink-ml docs are built by this script: https://github.com/apache/infrastructure-bb2/blob/master/flink-ml.py. The script is executed every day by a build bot whose status can be found by searching "flink ml" at https://ci2.apache.org/#/builders.

If you are also not sure where to find/update .htaccess for https://nightlies.apache.org/flink, do you know who might know the answer? If neither of us knows, maybe I should create a JIRA for the Apache infra team.

@MartijnVisser
Contributor

The flink-ml docs are built by this script: https://github.com/apache/infrastructure-bb2/blob/master/flink-ml.py. The script is executed every day by a build bot whose status can be found by searching "flink ml" at https://ci2.apache.org/#/builders.

@lindong28 I'm wondering if there's something wrong in the rsync step that causes the file serving https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ not to be removed. I think it's best to file a Jira for it. For the Flink repo, we've moved away from buildbot to the GitHub Actions workflow at https://github.com/apache/flink/blob/master/.github/workflows/docs.yml
