chore(security): Updating assert logic #10034

john-bodley · 2020-06-10T21:19:10Z

SUMMARY

This PR updates the security manager assertion logic making it more flexible for deployments to configure their security manager. Currently there exists two types of methods:

can_access_*, e.g. can_access_datasource, which returns a boolean in response to whether the user can access said resource.
assert_*_permission, e.g. assert_datasource_permission, which returns None but raises an exception if the user cannot access said resource.

The issue with this approach is that some methods like access_datasource have all the logic and assert_datasource_permission calls can_access_datasource and raises an exception if the response is False, whereas others like can_access_table have no assert equivalent.

Additional context as to why a user cannot access said resource is lost when the logic resides in the can_access_* and thus the preferred approach is to have the logic in the assert_*_permission method and thus the can_access_* methods (which are mostly just convenience methods) can take the form,

def can_access_datasource(self, datasource) -> bool:
    try:
        self.assert_datasource_permission(datasource)
    except SupersetSecurityException:
        return False

    return True

Note I’ve only translated a few of the can_access_* methods. I think in the future it could make sense that more of these methods could leverage this pattern depending on how deployments with custom security managers would want to proceed.

Secondly the term assert seems strange as one would expect it to raise an AssertionError if the assertion fails, whereas these methods raise a SupersetSecurityException, hence I opted for the requests approach (which uses raise_for_status) to call these methods raise_for_access and thus it seems clearer than an exception may be raised (and thus should be handled if necessary). I used the term access rather than permission for consistency with the can_access_* methods. I also added a wrapper function to various classes so rather than,

query_context = QueryContext(**json.loads(request.form["query_context"]))
security_manager.assert_query_context_permission(query_context)

the logic is now,

query_context = QueryContext(**json.loads(request.form["query_context"]))
query_context.raise_for_access()

Finally rather than having a slew of raise_for_*_access which can add to the confusion (and introduce unnecessary complexity) if a deployment needs to override these, i.e., they intrinsically need to know how these interact, I defined a single raise_for_access method which contains all the logic regardless of the resource type (table, datasource, visualization, etc.). This can be helpful say if a user cannot access to an outright datasource but due to column level security can access the datasource in the context of a visualization (where the columns are defined).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TEST PLAN

CI and added unit tests.

ADDITIONAL INFORMATION

codecov-commenter · 2020-06-10T21:25:50Z

Codecov Report

Merging #10034 into master will increase coverage by 4.84%.
The diff coverage is 92.15%.

@@            Coverage Diff             @@
##           master   #10034      +/-   ##
==========================================
+ Coverage   64.08%   68.93%   +4.84%     
==========================================
  Files         584      584              
  Lines       31054    31077      +23     
  Branches     3180     3180              
==========================================
+ Hits        19901    21422    +1521     
+ Misses      10975     9546    -1429     
+ Partials      178      109      -69

Flag	Coverage Δ
#cypress	`53.92% <ø> (?)`
#javascript	`59.48% <ø> (ø)`
#python	`67.40% <92.15%> (+0.06%)`	⬆️

Impacted Files	Coverage Δ
superset/viz_sip38.py	`0.00% <0.00%> (ø)`
superset/views/api.py	`66.66% <50.00%> (ø)`
superset/views/core.py	`75.39% <75.00%> (ø)`
superset/charts/api.py	`80.31% <100.00%> (ø)`
superset/common/query_context.py	`79.62% <100.00%> (+0.25%)`	⬆️
superset/connectors/base/models.py	`86.57% <100.00%> (+0.14%)`	⬆️
superset/security/manager.py	`91.17% <100.00%> (+2.13%)`	⬆️
superset/viz.py	`57.11% <100.00%> (+0.05%)`	⬆️
superset/connectors/sqla/models.py	`89.10% <0.00%> (+0.14%)`	⬆️
...rontend/src/SqlLab/components/AceEditorWrapper.tsx	`56.98% <0.00%> (+1.07%)`	⬆️
... and 147 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9532bff...06efb7d. Read the comment docs.

john-bodley · 2020-06-10T21:26:50Z

superset/security/manager.py

-        if not self.datasource_access(datasource):
-            raise SupersetSecurityException(
-                self.get_datasource_access_error_object(datasource),
+        from superset import db


I wonder if there should be any assertion logic to ensure that the right combination of parameters are defined. Note except for database and table these are all mutually exclusive. I did consider having an argument database_and_table which would be an Optional[Tuple["Database, "Table']] but I wasn't sold on the idea and thus opted for the additional docstring context.

An optional approach here would be to define DatasourceMixin which provides a datasource, and add it to BaseViz, QueryContext and Datasource. in this case we could simplify the signature of this method by combining viz, datasource and query_context into one single datasource: DatasourceMixin. The same could be done to query and database (DatabaseMixin). This could help clarify what the diffefent arguments mean.

It's tempting to think of something like use a **kwargs then use a dict to route the named parameters to smaller methods that would implement the security assertion logic. A bit crazy, but has it's upsides, cleaner signature, forced breaking logic into smaller chunks, easy to extend.

Not 100% sure if it's doable,

john-bodley · 2020-06-10T21:27:26Z

superset/views/core.py

@@ -2656,8 +2668,7 @@ def fetch_datasource_metadata(self) -> FlaskResponse:
        if not datasource:
            return json_error_response(DATASOURCE_MISSING_ERR)

-        # Check permission for datasource


Annexed self explanatory comment.

etr2460

one other question: does this handle queries run within SQL Lab too? I'm assume that's what the "datasource" part does, but wanted to confirm

etr2460 · 2020-06-11T03:29:12Z

superset/security/manager.py

-        )
+        try:
+            self.raise_for_access(datasource=datasource)
+        except SupersetSecurityException:


If there's a different exception thrown, then this fails open. Is that secure?

@etr2460 in theory (by construction) the raise_for_access method should only throw a SupersetSecurityException exception. Generally catching scoped (as opposed to general) exceptions is preferred.

john-bodley · 2020-06-11T15:24:28Z

@etr2460 SQL Lab uses the rejected_tables method which calls can_access_datasource (renamed to can_access_table in #10031) which leverages this the raise_for_access method.

john-bodley · 2020-06-15T16:32:11Z

@dpgaspar and @villebro I was wondering whether you had any updated thoughts regarding this approach.

Note I do believe that long term many constructs of the existing security manager (from a data access perspective) will need to change as the constructs of database, schema, datasource access is somewhat regimented especially within the world of row level access (exists within Apache Superset) and column level access (exists at Airbnb in the form of metric level access).

etr2460 · 2020-06-15T16:50:07Z

superset/charts/api.py

@@ -450,7 +450,7 @@ def data(self) -> Response:
        except KeyError:
            return self.response_400(message="Request is incorrect")
        try:
-            security_manager.assert_query_context_permission(query_context)
+            query_context.raise_for_access()


not changed in this PR, but this try/catch seems kinda weird, I would've expected this api to be wrapped in the handle_api_exception decorator that i believe automatically generates the proper json error response from an exception

@etr2460 I think this merely highlights the inconsistencies with how backend errors are handled. This is detailed in SIP-41.

@etr2460 #9529 kind of addresses that, it's a POC for SIP-41, uses flask error handling

etr2460

this looks good, although i'm not sure if @villebro or @dpgaspar still needed to take a look

john-bodley · 2020-06-20T02:47:32Z

superset/security/manager.py

@@ -232,7 +233,8 @@ def can_access_all_databases(self) -> bool:

    def can_access_database(self, database: "Database") -> bool:
        """
-        Return True if the user can access the Superset database, False otherwise.
+        Return True if the user can access all the tables within the Superset database,


I though there was merit in being more explicit about what it means to be able to access a database, schema, etc. It's not overly apparent without the comment.

villebro · 2020-06-20T11:08:10Z

@john-bodley this needs a rebase

john-bodley · 2020-06-21T05:24:53Z

@villebro I've rebased. Note in the second commit I also extended the logic to replace the rejected_tables method. I think this provides clarity/simplicity in the views but possibly adds complexity to the raise_for_access method.

villebro

LGTM. My only concern is the fairly loaded method raise_for_access method, which in current form feels slightly difficult to approach for first-timers, and was slightly difficult to follow with many branching if-statements. I'm not sure how you feel about the mixin approach, but it could help remove some complexity in this method, although would arguably introduce some bloat elsewhere.

villebro · 2020-06-22T05:15:13Z

superset/security/manager.py

-        if not self.datasource_access(datasource):
-            raise SupersetSecurityException(
-                self.get_datasource_access_error_object(datasource),
+        from superset import db


An optional approach here would be to define DatasourceMixin which provides a datasource, and add it to BaseViz, QueryContext and Datasource. in this case we could simplify the signature of this method by combining viz, datasource and query_context into one single datasource: DatasourceMixin. The same could be done to query and database (DatabaseMixin). This could help clarify what the diffefent arguments mean.

john-bodley · 2020-06-22T15:30:29Z

@villebro I agree that the raise_for_access method is fairly loaded, though in general I like the standardization it has provided in terms of checking for access from the perspective of a query, viz, etc. It also has reduced a significant amount of the spaghetti security logic, i.e., pervious assert_datasource_permission , can_access_datasource, can_access_table, rejected_tables, etc. whereas now there's just the raise_for_access and a convenience method.

The mixin is an interesting idea, though I sense it mightn’t simply the logic in raise_for_access as one would need to do a number of isinstance checks to work out whether the DatasourceMixin is a query context, viz, etc to pull out the appropriate attributes. Also this doesn’t address the issue of the Table argument as there’s no mixin related to that.

dpgaspar · 2020-06-22T16:13:10Z

@etr2460, @john-bodley

I'm aware of this change, will take a look today

dpgaspar · 2020-06-22T18:00:38Z

superset/security/manager.py

-        if not self.datasource_access(datasource):
-            raise SupersetSecurityException(
-                self.get_datasource_access_error_object(datasource),
+        from superset import db


It's tempting to think of something like use a **kwargs then use a dict to route the named parameters to smaller methods that would implement the security assertion logic. A bit crazy, but has it's upsides, cleaner signature, forced breaking logic into smaller chunks, easy to extend.

Not 100% sure if it's doable,

dpgaspar · 2020-06-22T18:05:33Z

superset/security/manager.py

+
+                if not (schema_perm and self.can_access("schema_access", schema_perm)):
+                    datasources = SqlaTable.query_datasources_by_name(
+                        db.session, database, table_.table, schema=table_.schema


Regarding the from superset import db
We could replace it by self.get_session

* chore(security): Updating assert logic * Deprecating rejected_tables Co-authored-by: John Bodley <[email protected]> (cherry picked from commit aefef9c)

* chore(security): Updating assert logic * Deprecating rejected_tables Co-authored-by: John Bodley <[email protected]>

pull-request-size bot added the size/L label Jun 10, 2020

john-bodley force-pushed the john-bodley--security-manager-assert branch from 9f9c526 to 4eb4e21 Compare June 10, 2020 21:22

john-bodley requested review from villebro, etr2460 and bkyryliuk June 10, 2020 21:23

john-bodley commented Jun 10, 2020

View reviewed changes

john-bodley marked this pull request as ready for review June 10, 2020 21:27

john-bodley force-pushed the john-bodley--security-manager-assert branch 3 times, most recently from 94a0fc7 to 7c05667 Compare June 11, 2020 01:25

etr2460 reviewed Jun 11, 2020

View reviewed changes

john-bodley force-pushed the john-bodley--security-manager-assert branch from 7c05667 to 06efb7d Compare June 11, 2020 22:16

john-bodley requested a review from dpgaspar June 15, 2020 16:31

etr2460 reviewed Jun 15, 2020

View reviewed changes

etr2460 approved these changes Jun 15, 2020

View reviewed changes

etr2460 added the v0.37 label Jun 19, 2020

john-bodley commented Jun 20, 2020

View reviewed changes

john-bodley force-pushed the john-bodley--security-manager-assert branch 4 times, most recently from b4ae299 to 997f1fd Compare June 20, 2020 06:05

john-bodley force-pushed the john-bodley--security-manager-assert branch from 997f1fd to 3c5f3ef Compare June 21, 2020 05:21

john-bodley force-pushed the john-bodley--security-manager-assert branch 2 times, most recently from 634943c to 95fe0ec Compare June 21, 2020 05:39

chore(security): Updating assert logic

e742714

john-bodley force-pushed the john-bodley--security-manager-assert branch 2 times, most recently from 1aef994 to bfb0965 Compare June 21, 2020 21:58

villebro approved these changes Jun 22, 2020

View reviewed changes

dpgaspar approved these changes Jun 22, 2020

View reviewed changes

Deprecating rejected_tables

24cf5fd

john-bodley force-pushed the john-bodley--security-manager-assert branch from bfb0965 to 24cf5fd Compare June 22, 2020 20:31

bkyryliuk approved these changes Jun 22, 2020

View reviewed changes

john-bodley merged commit aefef9c into apache:master Jun 24, 2020

john-bodley deleted the john-bodley--security-manager-assert branch June 24, 2020 03:49

john-bodley mentioned this pull request Jun 24, 2020

refactor: Using self.get_session in security manager #10146

Merged

6 tasks

john-bodley mentioned this pull request Jun 24, 2020

chore: Updating UPDATING.md #10155

Merged

6 tasks

auxten pushed a commit to auxten/incubator-superset that referenced this pull request Nov 20, 2020

chore(security): Updating assert logic (apache#10034)

2f674b5

* chore(security): Updating assert logic * Deprecating rejected_tables Co-authored-by: John Bodley <[email protected]>

mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.37.0 labels Mar 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(security): Updating assert logic #10034

chore(security): Updating assert logic #10034

john-bodley commented Jun 10, 2020 •

edited

Loading

codecov-commenter commented Jun 10, 2020 •

edited

Loading

john-bodley Jun 10, 2020

villebro Jun 22, 2020

dpgaspar Jun 22, 2020

john-bodley Jun 10, 2020

etr2460 left a comment

etr2460 Jun 11, 2020

john-bodley Jun 11, 2020

john-bodley commented Jun 11, 2020 •

edited

Loading

john-bodley commented Jun 15, 2020 •

edited

Loading

etr2460 Jun 15, 2020

john-bodley Jun 15, 2020

dpgaspar Jun 22, 2020 •

edited

Loading

etr2460 left a comment

john-bodley Jun 20, 2020

villebro commented Jun 20, 2020

john-bodley commented Jun 21, 2020 •

edited

Loading

villebro left a comment

villebro Jun 22, 2020

john-bodley commented Jun 22, 2020 •

edited

Loading

dpgaspar commented Jun 22, 2020

dpgaspar Jun 22, 2020

dpgaspar Jun 22, 2020

chore(security): Updating assert logic #10034

chore(security): Updating assert logic #10034

Conversation

john-bodley commented Jun 10, 2020 • edited Loading

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TEST PLAN

ADDITIONAL INFORMATION

codecov-commenter commented Jun 10, 2020 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

etr2460 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

john-bodley commented Jun 11, 2020 • edited Loading

john-bodley commented Jun 15, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dpgaspar Jun 22, 2020 • edited Loading

Choose a reason for hiding this comment

etr2460 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

villebro commented Jun 20, 2020

john-bodley commented Jun 21, 2020 • edited Loading

villebro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

john-bodley commented Jun 22, 2020 • edited Loading

dpgaspar commented Jun 22, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

john-bodley commented Jun 10, 2020 •

edited

Loading

codecov-commenter commented Jun 10, 2020 •

edited

Loading

john-bodley commented Jun 11, 2020 •

edited

Loading

john-bodley commented Jun 15, 2020 •

edited

Loading

dpgaspar Jun 22, 2020 •

edited

Loading

john-bodley commented Jun 21, 2020 •

edited

Loading

john-bodley commented Jun 22, 2020 •

edited

Loading