Easier withColumn method #453
Comments
Hey @MrPowers, sorry I missed this comment. I hear what you are saying. I definitely see this being easier, but unfortunately, this is nearly impossible to do. When you add a new column to |
@MrPowers I think I have a better answer for you. Say you have
As you can see, your schema changed from |
@imarios - Thanks for the detailed responses. I started brainstorming the idea of typed columns, see here, and think this idea might be useful for frameless as well. Columns are untyped and that's a big reason why Spark is so type unsafe. Typed columns can help us catch a lot more errors at compile time. When we run We could add a |
Great work on this lib! It's a great way to write Spark code!
As discussed here and in the docs, withColumn requires a full schema when a column is added.
Here's the example in the docs:
Couldn't we just assume that the schema stays the same for the existing columns and only supply the schema for the column that's being added?
I think this'd be a lot more user friendly. I'm often dealing with schemas that have tons of columns, and I add lots of columns with withColumn. Let me know your thoughts!
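To make the trade-off in this request concrete, here is a rough, dependency-free sketch in plain Scala. It is not frameless code and does not call Spark. Every name in it (Apartment, ApartmentWithRatio, addColumnTupled, addColumnAs) is hypothetical and only meant to illustrate the point raised in the thread: once a column is added, each row has a different static type, so the compiler needs either a generic shape such as a tuple or a caller-supplied type that spells out the whole new schema.

```scala
// Plain-Scala illustration only: no Spark or frameless APIs are used here.
// All names are hypothetical and chosen for this sketch.
object WithColumnSketch {

  // The "existing schema": three typed fields.
  final case class Apartment(city: String, surface: Int, price: Double)

  // Option 1: the caller supplies only the new column; the result falls back
  // to a tuple because no named type exists for "Apartment plus one field".
  def addColumnTupled[B](rows: Seq[Apartment])(f: Apartment => B): Seq[(String, Int, Double, B)] =
    rows.map(a => (a.city, a.surface, a.price, f(a)))

  // Option 2: the caller names the full target schema up front.
  final case class ApartmentWithRatio(city: String, surface: Int, price: Double, pricePerSqm: Double)

  // Here the caller also explains how an old row and the new value combine into U.
  def addColumnAs[B, U](rows: Seq[Apartment])(f: Apartment => B)(build: (Apartment, B) => U): Seq[U] =
    rows.map(a => build(a, f(a)))

  val apartments: Seq[Apartment] =
    Seq(Apartment("Paris", 50, 300000.0), Apartment("Lyon", 40, 200000.0))

  // The new column: price per square metre.
  val ratio: Apartment => Double = a => a.price / a.surface

  // Field names are lost: the rows are now plain 4-tuples.
  val tupled: Seq[(String, Int, Double, Double)] = addColumnTupled(apartments)(ratio)

  // Field names are kept, at the cost of writing out every column again.
  val named: Seq[ApartmentWithRatio] =
    addColumnAs(apartments)(ratio)((a, r) => ApartmentWithRatio(a.city, a.surface, a.price, r))
}
```

With a handful of columns the second option is cheap, but with very wide schemas re-declaring every field is exactly the friction described above. The first option avoids that, at the price of losing the field names.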